Abstract: Consider two $p$-variate populations, not necessarily Gaussian, with covariance
matrices $\Sigma_1$ and $\Sigma_2$, respectively, and let $S_1$ and $S_2$ be the sample covariance
matrices from samples of the populations with degrees of freedom $T$ and $n$, respectively.
When the difference $\Delta$ between $\Sigma_1$ and $\Sigma_2$ is of small rank compared to $p$, $T$
and $n$, the Fisher matrix $F = S_2^{-1}S_1$ is called a spiked Fisher matrix. When $p$, $T$
and $n$ grow to infinity proportionally, we establish a phase transition for the extreme
eigenvalues of $F$: when the eigenvalues of $\Delta$ (spikes) are above (or under) a critical
value, the associated extreme eigenvalues of the Fisher matrix will converge to some
point outside the support of the global limit (LSD) of other eigenvalues; otherwise,
they will converge to the edge points of the LSD. Furthermore, we derive central
limit theorems for these extreme eigenvalues of the spiked Fisher matrix. The limiting
distributions are found to be Gaussian if and only if the corresponding population
spike eigenvalues in $\Delta$ are simple. Numerical examples are provided to demonstrate
the finite sample performance of the results. In addition to classical applications of
a Fisher matrix in high-dimensional data analysis, we propose a new method for the
detection of signals allowing an arbitrary covariance structure of the noise. Simulation
experiments are conducted to illustrate the performance of this detector.

1. Introduction

Consider two $p$-variate populations with covariance matrices $\Sigma_1$ and $\Sigma_2$, respectively,
and let $S_1$ and $S_2$ be the sample covariance matrices from samples of the populations with
degrees of freedom $T$ and $n$, respectively. Specifically, if both populations are Gaussian, $TS_1$
and $nS_2$ are distributed as Wishart $W_p(T, \Sigma_1)$ and $W_p(n, \Sigma_2)$, respectively. For testing the
equality hypothesis $H_0 : \Sigma_1 = \Sigma_2$, the likelihood ratio statistic relies on the $p$ characteristic
roots of the determinantal equation

$$|S_1 - l S_2| = 0, \qquad l \in \mathbb{R}. \tag{1.1}$$

Here and throughout the paper, the determinant of a matrix $A$ is denoted by either $|A|$ or
$\det(A)$. In a famous episode of multivariate analysis of the last century, the joint distribution of
these characteristic roots for Gaussian populations was simultaneously and independently
published in 1939 by R. A. Fisher, S. N. Roy, P. L. Hsu and M. A. Girshick. When $S_2$ is
invertible, these roots are simply the eigenvalues of the matrix $F = S_2^{-1}S_1$, widely known
as a Fisher matrix in the literature, which generalises the one-dimensional Fisher ratio.

Another breakthrough is the work of Wachter (1980), where he finds a deterministic
limit, the celebrated Wachter distribution, for the empirical measure of these roots when
the dimension $p$ grows to infinity proportionally to the degrees of freedom $T$ and $n$ (under
the Gaussian assumption). Wachter's result was later extended to non-Gaussian populations
in what is now called random matrix theory; two early examples of such
extensions are Silverstein (1985) and Bai et al. (1987). It is also important to notice that
the determinantal equation (1.1) arises not only in the classical hypothesis testing problem
mentioned above; it also covers similar equations arising in important fields
of multivariate analysis such as discriminant analysis, canonical correlation analysis and
MANOVA, see Wachter (1980).

Needless to say, such limiting results, which allow large values of the dimension $p$ comparable
to the degrees of freedom (i.e. sample sizes), have considerable impact on today's
high-dimensional data analysis. A particularly important question is to investigate the properties
of the characteristic roots under an alternative of the form

$$H_1 : \Sigma_1 = \Sigma_2 + \Delta, \tag{1.2}$$

where $\Delta$ is a nonnegative definite matrix of rank $M$. When $p$, $T$ and $n$ are all large,
the discrimination between the null hypothesis and the alternative is not difficult if the
rank $M$ of the difference is also large. The real challenge here lies in detecting a small rank-$M$
alternative. In this perspective and assuming $M$ is a fixed integer while $p$, $T$ and $n$ grow
to infinity proportionally, the empirical measure of the $p$ characteristic roots of (1.1) will
be affected by a difference of order $M/p$ which vanishes, so that its limit remains the same
as in the null hypothesis, i.e. the Wachter distribution. In other words, such a global limit
from all the characteristic roots will be of little help for distinguishing the two hypotheses.

It happens that the useful information to detect a small rank alternative is encoded
in a few largest characteristic roots of (1.1). In a recent preprint, Dharmawansa et al.
(2014), assuming both populations are Gaussian and $M = 1$, show that,
when the norm of the rank-1 difference $\Delta$ (spike) exceeds a phase transition threshold, the
asymptotic behaviour of the log-ratio of the joint density of these characteristic roots under
a local deviation from the spike depends only on the largest characteristic root $l_{p,1}$ and
the statistical experiment of observing all the characteristic roots is locally asymptotically
normal (LAN). As a by-product of their analysis, the authors also establish joint asymptotic
normality of a few of the largest roots when the corresponding spikes in $\Delta$ (with $M > 1$)
exceed the phase transition threshold. As can be guessed, the analysis given in this
reference relies heavily on the Gaussian assumption, so that the joint density function of the
characteristic roots has an explicit form under both the null and the alternative,
and the main results are obtained via an accurate analytic approximation of the log-ratio
of these density functions when the dimension $p$, $T$ and $n$ grow to infinity proportionally.

Intrigued by these findings, in this paper, we explore the same questions for general
populations without the Gaussian assumption. In this setting the joint density of the
characteristic roots no longer exists and new techniques are needed to solve the questions.
Our approach relies on tools borrowed from the theory of random matrices. This theory
is closely connected to modern high-dimensional statistics, and has provided in recent years
many efficient estimation and testing procedures for high-dimensional data analysis.
Excellent introductions and surveys on this approach can be found in Bai (2005), Johnstone
(2007), Johnstone and Titterington (2009) and Paul and Aue (2014). A methodology
particularly successful both in theory and applications within this approach relies on the spiked
population model coined in Johnstone (2001). This model deals with one population only
with a unit population covariance matrix $I_p$ and the hypotheses are simply $H_0 : \Sigma_1 = I_p$
versus $H_1 : \Sigma_1 = I_p + \Delta$ where $\Delta$ is a rank-$M$ difference as in (1.2). Again for small rank
$M$, the discrimination between both hypotheses will rely on the extreme eigenvalues of
the sample covariance matrix $S_1$. Important results have been obtained in the last decade
on the behaviour of these extreme eigenvalues. For example, the fluctuation of the largest
eigenvalues of a sample covariance matrix from a complex spiked Gaussian population is
studied in Baik et al. (2005). These authors uncover a phase transition phenomenon: the
weak limit and the scaling of these extreme eigenvalues are different depending on whether
the eigenvalues of $\Delta$ (spikes) are above, equal to or below a critical value, situations referred
to as super-critical, critical and sub-critical, respectively. In Baik and Silverstein (2006), the
authors consider the spiked population model with general populations (not necessarily
Gaussian). For the almost sure limits of the extreme sample eigenvalues of $S_1$, they find
that if a population spike (in $\Delta$) is large or small enough, the corresponding sample spike
eigenvalues will converge to a limit outside the support of the limiting spectrum (outliers).
In Paul (2007), a CLT is established for these outliers, i.e. the super-critical case, under the
Gaussian assumption and assuming that population spikes are simple (multiplicity 1). The
CLT for super-critical outliers with general populations and arbitrary multiplicity numbers
is developed in Bai and Yao (2008). This theory was later extended to the generalised
spiked population model in Bai and Yao (2012).

In summary, from the perspective of the spiked population model, the Fisher matrix
$F = S_2^{-1}S_1$ under the alternative (1.2) can be viewed as a spiked Fisher matrix and it is
important to establish a theory for this two-population Fisher matrix $F$ in the vein of the
results discussed above on the one-population spiked covariance matrix $S_1$. As said before,
in Dharmawansa et al. (2014), the authors have already identified the transition
phenomenon for the extreme eigenvalues under the Gaussian assumption, and these eigenvalues
are proved to be asymptotically normal assuming that the spike eigenvalues in $\Delta$ are
simple. The main contributions of the paper are the following. We prove that this phase
transition phenomenon for extreme eigenvalues of a spiked Fisher matrix is universal, valid
for general populations under some suitable moment conditions. Next, we provide a general
CLT for the extreme sample eigenvalues of $F$ in the super-critical regime: the limiting
distributions are not necessarily Gaussian; they are Gaussian if and only if the population
spikes in $\Delta$ are simple.

In addition to the motivations given so far on the importance of a spiked Fisher matrix, 
we are able to implement an application of the general theory developed in this paper in 
the context of a signal detection problem with a large number of detectors, see Section 7. 
Indeed, this problem has its own interests and even with quite limited experiments, we 
show that our implementation can lead to very reliable solutions. 

Finally, within the theory of random matrices, the techniques we use in this paper for
spiked models are closely connected to other random matrix ensembles through the concept
of small-rank perturbations. The goal is again to examine the effect caused on the
extreme sample eigenvalues by such perturbations. Theories on perturbed Wigner matrices
can be found in Péché (2006), Féral and Péché (2007), Capitaine et al. (2009),
Pizzo et al. (2013) and Renfrew and Soshnikov (2013). In a more general setting of
finite-rank perturbation including both the additive and the multiplicative one, pointwise
convergence of extreme eigenvalues is established in Benaych-Georges and Nadakuditi
(2011) while their fluctuations are studied in Benaych-Georges et al. (2011). In addition,
Benaych-Georges and Nadakuditi (2011) also contains results on spiked eigenvectors.

The rest of the paper is organised as follows. First, the exact setting of the spiked Fisher
matrix $F = S_2^{-1}S_1$ is introduced in Section 2. Then in Section 3, we establish the phase
transition phenomenon for the extreme eigenvalues of $F$ where the transition boundary
is explicitly obtained. Next, CLTs for those extreme eigenvalues fluctuating around some
outliers (i.e. the super-critical case) are established first in Section 4 for one group of
sample eigenvalues corresponding to a same population spike, and then in Section 6 for all
the groups jointly. Section 5 contains numerical illustrations that demonstrate the finite
sample performance of our results. In Section 7, we develop in detail a signal detection
technique with prewhitening. Proofs of the main theorems are included in these sections
while some technical lemmas are deferred to Appendix A.

2. Spiked Fisher matrix and preliminary results 

In what follows, we will assume that $\Sigma_2 = I_p$. This assumption does not lose any
generality since the eigenvalues of the Fisher matrix $F = S_2^{-1}S_1$ are invariant under the
transformation $S_1 \mapsto \Sigma_2^{-1/2}S_1\Sigma_2^{-1/2}$, $S_2 \mapsto \Sigma_2^{-1/2}S_2\Sigma_2^{-1/2}$. Also we will write
$\Sigma_p$ for $\Sigma_1$ to signify the dependence on the dimension $p$. Therefore, the sample covariance
matrices $S_1$ and $S_2$ that make up the Fisher matrix $F = S_2^{-1}S_1$ are assumed to have the
following structure. Let

$$Z = (z_1, \dots, z_n) = (z_{ij})_{1 \le i \le p,\, 1 \le j \le n} \tag{2.1}$$

and

$$W = (w_1, \dots, w_T) = (w_{kl})_{1 \le k \le p,\, 1 \le l \le T} \tag{2.2}$$

be two independent arrays, with respective size $p \times n$ and $p \times T$, of independent real-valued
random variables with mean 0 and variance 1. The sample covariance matrix $S_2$ is

$$S_2 = \frac{1}{n}\sum_{k=1}^{n} z_k z_k^* = \frac{1}{n} ZZ^*. \tag{2.3}$$

Next, $\Sigma_p$ is a rank-$M$ perturbation of $I_p$; therefore, we can assume that it has the spiked
structure of the form

$$\Sigma_p = \begin{pmatrix} \Omega_M & 0 \\ 0 & I_{p-M} \end{pmatrix}, \tag{2.4}$$

where $\Omega_M$ is a $M \times M$ covariance matrix, $M$ being a fixed constant, containing $k$ spike
eigenvalues $(a_i)$, $(\underbrace{a_1, \dots, a_1}_{n_1}, \dots, \underbrace{a_k, \dots, a_k}_{n_k})$, of respective multiplicity
numbers $(n_i)$ ($n_1 + \cdots + n_k = M$). That is,
$\Omega_M = U\,\mathrm{diag}(\underbrace{a_1, \dots, a_1}_{n_1}, \dots, \underbrace{a_k, \dots, a_k}_{n_k})\,U^*$, where $U$ is a $M \times M$
orthogonal matrix. Consider a sample $x_1, \dots, x_T$ of size $T$ that can be expressed as
$x_l := \Sigma_p^{1/2} w_l$ and let $X = (x_1, \dots, x_T) = \Sigma_p^{1/2} W$. The sample covariance matrix $S_1$ is

$$S_1 = \frac{1}{T}\sum_{l=1}^{T} x_l x_l^* = \frac{1}{T} XX^*. \tag{2.5}$$



Throughout the paper, we consider an asymptotic regime of Marcenko-Pastur type, i.e.

$$p \wedge n \wedge T \to \infty, \qquad y_p := p/n \to y \in (0,1), \qquad \text{and} \qquad c_p := p/T \to c > 0. \tag{2.6}$$

Recall that the empirical spectral distribution (ESD) of a $p \times p$ matrix $A$ with eigenvalues
$\{\lambda_j\}$ is the distribution $p^{-1}\sum_{j=1}^{p}\delta_{\lambda_j}$, where $\delta_a$ denotes the Dirac mass at $a$. Since the
total rank $M$ generated by the $k$ spikes is fixed, the ESD of $F$ will have the same limit
(LSD) as if there were no spikes. This limiting spectral distribution, the celebrated Wachter
distribution, has been known for a long time.

Proposition 2.1. For the Fisher matrix $F = S_2^{-1}S_1$ with the sample covariance matrices
$S_i$'s given in (2.3)-(2.5), assume that the dimension $p$ and the two sample sizes $n, T$ grow
to infinity proportionally as in (2.6). Then almost surely, the ESD of $F$ weakly converges
to a deterministic distribution $F_{c,y}$ with a bounded support $[b_1, b]$ and a density function
given by

$$f_{c,y}(x) = \begin{cases} \dfrac{(1-y)\sqrt{(b-x)(x-b_1)}}{2\pi x\,(c + xy)}, & \text{when } b_1 \le x \le b, \\[2mm] 0, & \text{otherwise}, \end{cases} \tag{2.7}$$

where

$$b_1 = \left(\frac{1 - \sqrt{c + y - cy}}{1 - y}\right)^2 \qquad \text{and} \qquad b = \left(\frac{1 + \sqrt{c + y - cy}}{1 - y}\right)^2. \tag{2.8}$$

Furthermore, if $c > 1$, then $F_{c,y}$ has a point mass $1 - 1/c$ at the origin. Also, the Stieltjes
transform $s(z)$ of $F_{c,y}$ equals:

$$s(z) = \frac{1}{zc} - \frac{1}{z} - \frac{c\,(z(1-y) + 1 - c) + 2zy - c\sqrt{(1 - c + z(1-y))^2 - 4z}}{2zc\,(c + zy)}, \qquad z \notin [b_1, b]. \tag{2.9}$$



Remark 2.1. Assuming both populations are Gaussian, Wachter (1980, Theorem 3.1)
derives the limiting distribution for roots of the determinantal equation

$$|TS_1 - x^2(TS_1 + nS_2)| = 0, \qquad x \in \mathbb{R}.$$

The continuous component of the distribution has a compact support $[A^2, B^2]$ with density
function proportional to $\{(x - A^2)(B^2 - x)\}^{1/2}/\{x(1 - x^2)\}$. It can be readily checked that
by the change of variable $z = cx^2/\{y(1 - x^2)\}$, the density of the continuous component
of the LSD of $F$ is exactly (2.7). The validity of this limit for general populations (not
necessarily Gaussian) is due to Silverstein (1985) and Bai et al. (1987).
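
As a numerical sanity check on Proposition 2.1, the following sketch evaluates the edges (2.8) and the density (2.7) and integrates the density with a plain midpoint rule. The function names and the parameter choice $(c, y) = (0.2, 0.5)$ are illustrative assumptions, not part of the paper; for $c \le 1$ the total mass should be close to 1 and, from the expansion of (2.9) at infinity, the mean should be close to $1/(1-y)$.

```python
import math

def wachter_edges(c, y):
    """Support edges (b1, b) of F_{c,y} from (2.8); requires 0 < y < 1."""
    h = math.sqrt(c + y - c * y)
    return ((1 - h) / (1 - y)) ** 2, ((1 + h) / (1 - y)) ** 2

def wachter_density(x, c, y):
    """Density (2.7) of the continuous part of F_{c,y}; zero outside (b1, b)."""
    b1, b = wachter_edges(c, y)
    if not (b1 < x < b):
        return 0.0
    return (1 - y) * math.sqrt((b - x) * (x - b1)) / (2 * math.pi * x * (c + x * y))

def midpoint_integral(g, lo, hi, n=200_000):
    """Plain midpoint rule on [lo, hi] with n cells."""
    step = (hi - lo) / n
    return step * sum(g(lo + (k + 0.5) * step) for k in range(n))

# Illustrative parameters (an assumption of this sketch).
c, y = 0.2, 0.5
b1, b = wachter_edges(c, y)
mass = midpoint_integral(lambda x: wachter_density(x, c, y), b1, b)
mean = midpoint_integral(lambda x: x * wachter_density(x, c, y), b1, b)
print(f"support = [{b1:.3f}, {b:.3f}], mass = {mass:.4f}, mean = {mean:.4f}")
```
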

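For a complex number $z \notin [b_1, b]$, we define the following integrals with respect to $F_{c,y}(x)$:

$$\begin{aligned}
s(z) &:= \int \frac{1}{x - z}\, dF_{c,y}(x), & m_1(z) &:= \int \frac{1}{(z - x)^2}\, dF_{c,y}(x), \\
m_2(z) &:= \int \frac{x}{z - x}\, dF_{c,y}(x), & m_3(z) &:= \int \frac{x}{(z - x)^2}\, dF_{c,y}(x), \\
m_4(z) &:= \int \frac{x^2}{(z - x)^2}\, dF_{c,y}(x).
\end{aligned} \tag{2.10}$$
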
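
The closed form (2.9) and the integral definition of $s(z)$ in (2.10) can be compared numerically. The sketch below is a check under stated assumptions: $z$ is real and outside $[b_1, b]$, $c < 1$ so that there is no atom at the origin, and the square root in (2.9) is taken positive to the right of the support and negative to the left of it. The helper names are hypothetical.

```python
import math

def edges(c, y):
    h = math.sqrt(c + y - c * y)
    return ((1 - h) / (1 - y)) ** 2, ((1 + h) / (1 - y)) ** 2

def density(x, c, y):
    b1, b = edges(c, y)
    return (1 - y) * math.sqrt((b - x) * (x - b1)) / (2 * math.pi * x * (c + x * y))

def stieltjes_closed_form(z, c, y):
    """Formula (2.9) for real z outside the support [b1, b]."""
    b1, b = edges(c, y)
    if b1 <= z <= b:
        raise ValueError("z must lie outside the support [b1, b]")
    root = math.sqrt((1 - c + z * (1 - y)) ** 2 - 4 * z)
    if z < b1:          # branch choice to the left of the support
        root = -root
    num = c * (z * (1 - y) + 1 - c) + 2 * z * y - c * root
    return 1 / (z * c) - 1 / z - num / (2 * z * c * (c + z * y))

def stieltjes_by_quadrature(z, c, y, n=200_000):
    """Midpoint approximation of the integral of 1/(x - z) against F_{c,y}."""
    b1, b = edges(c, y)
    step = (b - b1) / n
    total = 0.0
    for k in range(n):
        x = b1 + (k + 0.5) * step
        total += density(x, c, y) / (x - z)
    return step * total

c, y = 0.2, 0.5
right = (stieltjes_closed_form(14.0, c, y), stieltjes_by_quadrature(14.0, c, y))
left = (stieltjes_closed_form(2 / 15, c, y), stieltjes_by_quadrature(2 / 15, c, y))
print(right, left)
```
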


3. Phase transition of the extreme eigenvalues of $F = S_2^{-1}S_1$


In this section, we establish a phase transition phenomenon for the extreme eigenvalues
of $F = S_2^{-1}S_1$: when a population spike $a_i$ with multiplicity $n_i$ is larger (or smaller)
than a critical value, a packet of $n_i$ corresponding sample eigenvalues of $F$ will jump
outside the support $[b_1, b]$ of its LSD $F_{c,y}$ and all converge to a fixed limit. Otherwise, these
associated sample eigenvalues will converge to one of the edges $b_1$ and $b$.

For notational convenience, let $\gamma = 1/(1 - y) \in (1, \infty)$. Define the function

$$\phi(x) = \frac{\gamma x\,(x - 1 + c)}{x - \gamma}, \qquad x \ne \gamma, \tag{3.1}$$

which is a rational function with a single pole $\gamma$. An example is depicted in Figure 1 with
parameters $(c, y) = (\tfrac{1}{5}, \tfrac{1}{2})$. The function has an asymptote of equation
$g(x) = \gamma(x + c - 1 + \gamma)$ when $|x| \to \infty$.

Figure 1: Example of the $\phi$ function with $(c, y) = (\tfrac{1}{5}, \tfrac{1}{2})$ and pole $\gamma = 2$. The asymptote
has equation $y = 2x + \tfrac{12}{5}$. The boundary points are $A(0.450, 0.203)$ and $B(3.549, 12.597)$,
meaning that critical values for spikes are 0.450 and 3.549 while the support of the LSD is
$[0.203, 12.597]$.

By assumption, the $k$ population spike eigenvalues $\{a_i\}$ are all positive and non unit.
We order them with their multiplicities in descending order together with the $p - M$ unit
eigenvalues as

$$a_1 = \cdots = a_1 > a_2 = \cdots = a_2 > \cdots > a_{k_0} = \cdots = a_{k_0} > 1 = \cdots = 1 >
a_{k_0+1} = \cdots = a_{k_0+1} > \cdots > a_k = \cdots = a_k. \tag{3.2}$$

That is, $k_0$ of these spike eigenvalues are larger than 1 while the other $k - k_0$ are smaller.
Let

$$J_i = \begin{cases} [\,n_1 + \cdots + n_{i-1} + 1,\ n_1 + \cdots + n_i\,], & 1 \le i \le k_0, \\ [\,p - (n_i + \cdots + n_k) + 1,\ p - (n_{i+1} + \cdots + n_k)\,], & k_0 < i \le k. \end{cases}$$

Notice that the cardinality of each $J_i$ is $n_i$. Next, the sample eigenvalues $\{l_{p,j}\}$ of the Fisher
matrix $S_2^{-1}S_1$ are also sorted in descending order as $l_{p,1} \ge l_{p,2} \ge \cdots \ge l_{p,p}$. Therefore,
for each spike eigenvalue $a_i$, there are $n_i$ associated sample eigenvalues $\{l_{p,j},\ j \in J_i\}$.
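
To make the index sets $J_i$ concrete, the following sketch computes them for given multiplicities. It is only an illustration: the function name and the example values are assumptions, and each set is returned as an inclusive pair of 1-based indices.

```python
def spike_index_sets(multiplicities, k0, p):
    """Return J_1, ..., J_k as inclusive (first, last) pairs of 1-based indices.

    multiplicities: [n_1, ..., n_k], listed in the order of (3.2)
    k0: number of spikes larger than 1
    p: dimension
    """
    k = len(multiplicities)
    sets = []
    for i in range(1, k + 1):
        if i <= k0:
            before = sum(multiplicities[: i - 1])        # n_1 + ... + n_{i-1}
            sets.append((before + 1, before + multiplicities[i - 1]))
        else:
            tail_from_i = sum(multiplicities[i - 1:])    # n_i + ... + n_k
            tail_after_i = sum(multiplicities[i:])       # n_{i+1} + ... + n_k
            sets.append((p - tail_from_i + 1, p - tail_after_i))
    return sets

# Two spikes above 1 (multiplicities 2 and 1), one below 1 (multiplicity 3).
index_sets = spike_index_sets([2, 1, 3], k0=2, p=100)
print(index_sets)  # [(1, 2), (3, 3), (98, 100)]
```

The spikes above 1 are attached to the largest sample eigenvalues and those below 1 to the smallest ones, which is why the second branch counts from $p$ downwards.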

Theorem 3.1. For the Fisher matrix $F = S_2^{-1}S_1$ with the sample covariance matrices $S_i$'s
given in (2.3)-(2.5), assume that the dimension $p$ and the two sample sizes $n, T$ grow to
infinity proportionally as in (2.6). Then for any spike eigenvalue $a_i$ ($i = 1, \dots, k$), it holds
that for all $j \in J_i$, $l_{p,j}$ almost surely converges to a limit

$$\lambda_i = \begin{cases} \phi(a_i), & |a_i - \gamma| > \gamma\sqrt{c + y - cy}, \\ b, & 1 < a_i \le \gamma\{1 + \sqrt{c + y - cy}\}, \\ b_1, & \gamma\{1 - \sqrt{c + y - cy}\} \le a_i < 1. \end{cases} \tag{3.3}$$


Basically, the theorem establishes a phase transition phenomenon for the largest and
smallest sample eigenvalues of a Fisher matrix. Consider again the example shown in
Figure 1. The transition boundary is indicated with the boundary points $A$ and $B$ with
respective coordinates

$$A\big(\gamma\{1 - \sqrt{c + y - cy}\},\ b_1\big) \qquad \text{and} \qquad B\big(\gamma\{1 + \sqrt{c + y - cy}\},\ b\big).$$

When the spike is large enough or small enough, the corresponding sample eigenvalues
converge to $\phi(a_i)$ located outside the support $[b_1, b]$ of the LSD of $F$. Otherwise, they
converge to one of its edges $b_1$ and $b$.
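
The limit in (3.3) is straightforward to evaluate. The sketch below, with hypothetical function names, implements $\phi$ from (3.1) and the three-case limit, and prints the critical spike values together with the support edges for $(c, y) = (0.2, 0.5)$, which should match the boundary points quoted for Figure 1.

```python
import math

def phi(x, c, y):
    """The function (3.1); gamma = 1/(1 - y) is its single pole."""
    gamma = 1 / (1 - y)
    return gamma * x * (x - 1 + c) / (x - gamma)

def extreme_eigenvalue_limit(a, c, y):
    """Almost sure limit (3.3) of the sample eigenvalues attached to a spike a != 1."""
    gamma = 1 / (1 - y)
    h = math.sqrt(c + y - c * y)
    b1, b = (gamma * (1 - h)) ** 2, (gamma * (1 + h)) ** 2
    if abs(a - gamma) > gamma * h:   # super-critical: outlier at phi(a)
        return phi(a, c, y)
    return b if a > 1 else b1        # otherwise: sticks to an edge of the LSD

c, y = 0.2, 0.5
gamma, h = 1 / (1 - y), math.sqrt(c + y - c * y)
crit_low, crit_high = gamma * (1 - h), gamma * (1 + h)
print(f"critical spikes: {crit_low:.3f}, {crit_high:.3f}")
print(f"support edges:   {phi(crit_low, c, y):.3f}, {phi(crit_high, c, y):.3f}")
for a in (0.2, 0.6, 3.0, 5.0):
    print(a, extreme_eigenvalue_limit(a, c, y))
```
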

It is worth observing that when $y \to 0$, the function $\phi(x)$ tends to the function
well-known in the literature for a similar transition phenomenon of a spiked sample covariance
matrix, i.e.

$$\lim_{y \to 0} \phi(x) = x + \frac{cx}{x - 1}, \qquad x \ne 1, \tag{3.4}$$

see e.g. the $\psi$-function on Figure 4 of Bai and Yao (2012). These functions share the same
shape; however the pole here equals $\gamma = 1/(1 - y)$, which is larger than the pole 1 for the
case of a spiked sample covariance matrix.
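
The limit (3.4) can be observed numerically by letting $y$ decrease to 0 in (3.1). This is a small illustration with arbitrarily chosen values of $x$ and $c$; the function names are hypothetical.

```python
def phi(x, c, y):
    """The two-sample function (3.1)."""
    gamma = 1 / (1 - y)
    return gamma * x * (x - 1 + c) / (x - gamma)

def phi_one_sample(x, c):
    """Right-hand side of (3.4), the one-sample spiked covariance function."""
    return x + c * x / (x - 1)

c, x = 0.5, 3.0
for y in (0.1, 0.01, 0.001, 0.0):
    print(f"y = {y:<6} phi = {phi(x, c, y):.6f}   limit = {phi_one_sample(x, c):.6f}")
```
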

As said in the Introduction, this transition phenomenon has already been established in
a preprint, Dharmawansa et al. (2014) (their Proposition 5), under the Gaussian assumption
and using a completely different approach. Theorem 3.1 proves that such a phase transition
phenomenon is indeed universal.

Proof. (of Theorem 3.1) The proof is divided into the following three steps:

• Step 1: we derive the almost sure limit of an outlier eigenvalue of $S_2^{-1}S_1$;

• Step 2: we show that in order for the extreme eigenvalue of $S_2^{-1}S_1$ to be an outlier,
the population spike should be larger (or smaller) than a critical value;

• Step 3: if not so, the extreme eigenvalue of $S_2^{-1}S_1$ will converge to one of the edge
points $b$ and $b_1$.

Step 1: Let $l_{p,j}$ ($j \in J_i$) be the outlier eigenvalue of $S_2^{-1}S_1$ corresponding to the population
spike $a_i$. Then $l_{p,j}$ must satisfy the following equation:

$$|l_{p,j} I_p - S_2^{-1}S_1| = 0,$$

and it is equivalent to

$$|l_{p,j} S_2 - S_1| = 0. \tag{3.5}$$

Now we introduce some shorthand. Denote $Z = \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}$, where $Z_1$ is the $n$ observations of its
first $M$ coordinates and $Z_2$ the remaining. We partition $X$ accordingly as $X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$,
where $X_1$ is the $T$ observations of its first $M$ coordinates and $X_2$ the remaining. Using
such a representation, we have

$$S_1 = \frac{1}{T}XX^* = \frac{1}{T}\begin{pmatrix} X_1X_1^* & X_1X_2^* \\ X_2X_1^* & X_2X_2^* \end{pmatrix}, \qquad
S_2 = \frac{1}{n}ZZ^* = \frac{1}{n}\begin{pmatrix} Z_1Z_1^* & Z_1Z_2^* \\ Z_2Z_1^* & Z_2Z_2^* \end{pmatrix}. \tag{3.6}$$

Then, (3.5) can be written in the block form:

$$\begin{vmatrix} \dfrac{l_{p,j}}{n}Z_1Z_1^* - \dfrac{1}{T}X_1X_1^* & \dfrac{l_{p,j}}{n}Z_1Z_2^* - \dfrac{1}{T}X_1X_2^* \\[3mm] \dfrac{l_{p,j}}{n}Z_2Z_1^* - \dfrac{1}{T}X_2X_1^* & \dfrac{l_{p,j}}{n}Z_2Z_2^* - \dfrac{1}{T}X_2X_2^* \end{vmatrix} = 0. \tag{3.7}$$



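Since $l_{p,j}$ is an outlier, it holds $\left|\frac{l_{p,j}}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\right| \ne 0$, and for a block matrix, we have
$\begin{vmatrix} A & B \\ C & D \end{vmatrix} = \det D \cdot \det(A - BD^{-1}C)$ when $D$ is invertible. Therefore, (3.7) reduces to

$$\det\left( \frac{l_{p,j}}{n}Z_1Z_1^* - \frac{1}{T}X_1X_1^* - \left[\frac{l_{p,j}}{n}Z_1Z_2^* - \frac{1}{T}X_1X_2^*\right]\left[\frac{l_{p,j}}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\right]^{-1}\left[\frac{l_{p,j}}{n}Z_2Z_1^* - \frac{1}{T}X_2X_1^*\right] \right) = 0.$$
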
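
The block determinant identity used above, $\det\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \det D \cdot \det(A - BD^{-1}C)$, can be checked on a small example. The matrices below are arbitrary illustrative values and the helper functions are a minimal sketch written for this check, not part of the paper.

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, sign, prod = len(a), 1.0, 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        prod *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return sign * prod

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def inverse_2x2(d):
    (p, q), (r, s) = d
    det_d = p * s - q * r
    return [[s / det_d, -q / det_d], [-r / det_d, p / det_d]]

# Arbitrary 2 x 2 blocks, with D invertible.
A = [[4.0, 1.0], [2.0, 3.0]]
B = [[1.0, 0.5], [0.0, 2.0]]
C = [[0.5, 1.0], [1.5, 0.0]]
D = [[5.0, 1.0], [1.0, 4.0]]

full = [A[0] + B[0], A[1] + B[1], C[0] + D[0], C[1] + D[1]]
BDinvC = matmul(matmul(B, inverse_2x2(D)), C)
schur = [[A[i][j] - BDinvC[i][j] for j in range(2)] for i in range(2)]

lhs = det(full)
rhs = det(D) * det(schur)
print(lhs, rhs)
```
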


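More specifically, we have

$$\begin{aligned}
\det\Bigg(
&\underbrace{\frac{l_{p,j}}{n}Z_1\Big[I_n - Z_2^*\Big(l_{p,j}I_p - \big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2X_2^*\Big)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{l_{p,j}}{n}Z_2\Big]Z_1^*}_{(I)} \\
&\underbrace{{}- \frac{1}{T}X_1\Big[I_T + X_2^*\Big(l_{p,j}I_p - \big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2X_2^*\Big)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2\Big]X_1^*}_{(II)} \\
&\underbrace{{}+ \frac{l_{p,j}}{n}Z_1Z_2^*\Big(l_{p,j}I_p - \big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2X_2^*\Big)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2X_1^*}_{(III)} \\
&\underbrace{{}+ \frac{1}{T}X_1X_2^*\Big(l_{p,j}I_p - \big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2X_2^*\Big)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{l_{p,j}}{n}Z_2Z_1^*}_{(IV)}
\Bigg) = 0.
\end{aligned} \tag{3.8}$$
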


In all the following, we denote by $S$ the Fisher matrix $\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2X_2^*$, which has the
LSD $F_{c,y}(x)$. In order to find the limit of $l_{p,j}$, we simply find the limit of the left hand
side of (3.8), which generates an equation. Solving this equation will give the value of
the limit.

First, consider the terms (III) and (IV). Since $(Z_1, X_1)$ is independent of $(Z_2, X_2)$, using
Lemma A.2, we see these two terms will converge to some constant multiplied by the
covariance matrix between $X_1$ and $Z_1$. On the other hand, since $X_1$ is also independent of $Z_1$,
we have

$$\mathrm{Cov}(X_1, Z_1) = \mathbb{E}X_1Z_1 - \mathbb{E}X_1\mathbb{E}Z_1 = \mathbb{E}X_1\mathbb{E}Z_1 - \mathbb{E}X_1\mathbb{E}Z_1 = 0_{M \times M}.$$

Therefore, these two terms will both tend to a zero matrix $0_{M \times M}$ almost surely.

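So the remaining task is to find the limits of (I) and (II). We recall the covariance structure of $X_1$
and $Z_1$:

$$\mathrm{Cov}(X_1) = U\,\mathrm{diag}(\underbrace{a_1, \dots, a_1}_{n_1}, \dots, \underbrace{a_k, \dots, a_k}_{n_k})\,U^*, \qquad \mathrm{Cov}(Z_1) = I_M.$$

According to Lemma A.2, we have

$$\begin{aligned}
(I) &= \frac{l_{p,j}}{n}Z_1\Big[I_n - Z_2^*(l_{p,j}I_p - S)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{l_{p,j}}{n}Z_2\Big]Z_1^* \\
&\to \frac{\lambda_i}{n}\Big\{\mathbb{E}\,\mathrm{tr}\Big[I_n - Z_2^*(\lambda_i I_p - S)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{\lambda_i}{n}Z_2\Big]\Big\}\cdot I_M \\
&\to \lambda_i\big(1 + y\lambda_i s(\lambda_i)\big)\cdot I_M;
\end{aligned} \tag{3.9}$$

here, we denote by $\lambda_i$ the limit of the outlier $\{l_{p,j},\ j \in J_i\}$. For the same reason,

$$\begin{aligned}
(II) &= -\frac{1}{T}X_1\Big[I_T + X_2^*(l_{p,j}I_p - S)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2\Big]X_1^* \\
&\to -\frac{1}{T}\Big\{\mathbb{E}\,\mathrm{tr}\Big[I_T + X_2^*(\lambda_i I_p - S)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2\Big]\Big\}\cdot U\,\mathrm{diag}(a_1 I_{n_1}, \dots, a_k I_{n_k})\,U^* \\
&\to \big(-1 + c + c\lambda_i s(\lambda_i)\big)\cdot U\,\mathrm{diag}(a_1 I_{n_1}, \dots, a_k I_{n_k})\,U^*.
\end{aligned} \tag{3.10}$$
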


Therefore, combining (3.8), (3.9) and (3.10), we have the determinant of the following
$M \times M$ matrix

$$U\begin{pmatrix} \lambda_i(1 + y\lambda_i s(\lambda_i)) + (-1 + c + c\lambda_i s(\lambda_i))\,a_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_i(1 + y\lambda_i s(\lambda_i)) + (-1 + c + c\lambda_i s(\lambda_i))\,a_k \end{pmatrix}U^*$$

equal to zero, which is also to say that $\lambda_i$ satisfies the equation:

$$\lambda_i\big(1 + y\lambda_i s(\lambda_i)\big) + \big(-1 + c + c\lambda_i s(\lambda_i)\big)\,a_i = 0. \tag{3.11}$$

Finally, together with the expression of the Stieltjes transform of a Fisher matrix in (2.9),
we have

$$\lambda_i = \frac{a_i(a_i + c - 1)}{a_i - a_i y - 1} = \phi(a_i),$$

where the function $\phi(x)$ is defined in (3.1).
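
The conclusion of Step 1 can be checked numerically: plugging $\lambda_i = \phi(a_i)$ and the closed form (2.9) into the left-hand side of (3.11) should give zero. The sketch below does this for one spike above and one spike below the critical values; the function names, the parameter values and the branch choice for the square root (positive to the right of the support, negative to the left) are assumptions of this illustration.

```python
import math

def phi(a, c, y):
    """phi from (3.1), written as a(a + c - 1)/(a - a*y - 1)."""
    return a * (a + c - 1) / (a - a * y - 1)

def stieltjes(z, c, y):
    """Formula (2.9) for real z outside the support [b1, b]."""
    h = math.sqrt(c + y - c * y)
    b1, b = ((1 - h) / (1 - y)) ** 2, ((1 + h) / (1 - y)) ** 2
    if b1 <= z <= b:
        raise ValueError("z must lie outside the support [b1, b]")
    root = math.sqrt((1 - c + z * (1 - y)) ** 2 - 4 * z)
    if z < b1:
        root = -root
    num = c * (z * (1 - y) + 1 - c) + 2 * z * y - c * root
    return 1 / (z * c) - 1 / z - num / (2 * z * c * (c + z * y))

def lhs_of_3_11(lam, a, c, y):
    """Left-hand side of equation (3.11)."""
    s = stieltjes(lam, c, y)
    return lam * (1 + y * lam * s) + (-1 + c + c * lam * s) * a

c, y = 0.2, 0.5
residual_large = lhs_of_3_11(phi(5.0, c, y), 5.0, c, y)   # spike above 3.549
residual_small = lhs_of_3_11(phi(0.2, c, y), 0.2, c, y)   # spike below 0.450
print(residual_large, residual_small)
```
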

Step 2: Define $\underline{s}(z)$ as the Stieltjes transform of the LSD of $\tfrac{1}{T}X_2^*\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}X_2$, which shares
the same non-zero eigenvalues as $S_2^{-1}S_1$. Then we have the relationship:

$$\underline{s}(z) + \frac{1}{z}(1 - c) = c\,s(z). \tag{3.12}$$

Recalling the expression of $s(z)$ in (2.9), we have

$$\underline{s}(z) = -\frac{c\,(z(1-y) + 1 - c) + 2zy - c\sqrt{(1 - c + z(1-y))^2 - 4z}}{2z\,(c + zy)}. \tag{3.13}$$

On the other hand, due to (3.11) and (3.12), we have the value of $\underline{s}(\lambda_i)$:

$$\underline{s}(\lambda_i) = \frac{yc - y - c}{y\lambda_i + a_i c}. \tag{3.14}$$

Since $\lambda_i$ is outside the support of the LSD, we have

$$\underline{s}^{-1}\!\left(\frac{yc - y - c}{y\lambda_i + a_i c}\right) = \lambda_i > b \qquad \text{or} \qquad \underline{s}^{-1}\!\left(\frac{yc - y - c}{y\lambda_i + a_i c}\right) = \lambda_i < b_1,$$

which is also to say that

$$\underline{s}(b) < \frac{yc - y - c}{y\lambda_i + a_i c}, \tag{3.15}$$

or

$$\underline{s}(b_1) > \frac{yc - y - c}{y\lambda_i + a_i c}. \tag{3.16}$$

Then (3.15) says that $\underline{s}(b)$ must be smaller than the minimum value of its right hand side,
which is attained when $\lambda_i = b$ (the right hand side of (3.15) is an increasing
function of $\lambda_i$, since its numerator $yc - y - c$ is negative). Similarly, (3.16) says that $\underline{s}(b_1)$ must be larger than the maximum value
of its right hand side, which is attained when $\lambda_i = b_1$. Therefore, the condition for $\lambda_i$ to be
an outlier is:

$$\underline{s}(b) < \frac{yc - y - c}{yb + a_i c} \qquad \text{or} \qquad \underline{s}(b_1) > \frac{yc - y - c}{yb_1 + a_i c}. \tag{3.17}$$

Finally, using (3.13) together with the values of $b$ and $b_1$, we have:

$$a_i > \frac{1 + \sqrt{c + y - cy}}{1 - y} \qquad \text{or} \qquad a_i < \frac{1 - \sqrt{c + y - cy}}{1 - y},$$

which is equivalent to saying that (recall that $\gamma = 1/(1 - y)$):

$$|a_i - \gamma| > \gamma\sqrt{c + y - cy}.$$
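
Relations (3.12)-(3.14) can also be checked numerically. The sketch below computes $\underline{s}$ from the closed form (2.9) through (3.12) and compares it with the right-hand side of (3.14) at $\lambda_i = \phi(a_i)$, and then verifies that (3.17) holds with equality exactly at the upper critical spike $\gamma\{1 + \sqrt{c + y - cy}\}$. Names, parameter values and the branch choice for the square root are assumptions of this illustration.

```python
import math

def phi(a, c, y):
    return a * (a + c - 1) / (a - a * y - 1)

def stieltjes(z, c, y, root_sign=1.0):
    """Formula (2.9); root_sign = +1 at or right of b, -1 at or left of b1."""
    root = root_sign * math.sqrt(max((1 - c + z * (1 - y)) ** 2 - 4 * z, 0.0))
    num = c * (z * (1 - y) + 1 - c) + 2 * z * y - c * root
    return 1 / (z * c) - 1 / z - num / (2 * z * c * (c + z * y))

def companion(z, c, y, root_sign=1.0):
    """Companion transform obtained from (3.12)."""
    return c * stieltjes(z, c, y, root_sign) - (1 - c) / z

def rhs_3_14(lam, a, c, y):
    return (y * c - y - c) / (y * lam + a * c)

c, y, a = 0.2, 0.5, 5.0
lam = phi(a, c, y)
gap_outlier = companion(lam, c, y) - rhs_3_14(lam, a, c, y)

# At the critical spike the outlier limit merges with the right edge b.
gamma, h = 1 / (1 - y), math.sqrt(c + y - c * y)
a_crit, b = gamma * (1 + h), (gamma * (1 + h)) ** 2
gap_critical = companion(b, c, y) - rhs_3_14(b, a_crit, c, y)
print(gap_outlier, gap_critical)
```
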

Step 3: In this step, we show that if the condition in Step 2 is not fulfilled, then the
extreme eigenvalues of $S_2^{-1}S_1$ will tend to one of the edge points $b_1$ and $b$. For simplicity,
we only show the convergence to the right edge $b$: the proof for the convergence to the left
edge $b_1$ is similar. Thus suppose all the $a_i > 1$ for $i = 1, \dots, k$. For now, we introduce some
shorthand. Let

$$S_1 = \frac{1}{T}XX^* = \frac{1}{T}\begin{pmatrix} X_1X_1^* & X_1X_2^* \\ X_2X_1^* & X_2X_2^* \end{pmatrix} := \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix},$$

$$S_2 = \frac{1}{n}ZZ^* = \frac{1}{n}\begin{pmatrix} Z_1Z_1^* & Z_1Z_2^* \\ Z_2Z_1^* & Z_2Z_2^* \end{pmatrix} := \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},$$

where $B_{11}$ and $A_{11}$ are the corresponding blocks with size $M \times M$. Using the inverse formula
for block matrices, the $(p - M) \times (p - M)$ major sub-matrix of $S_2^{-1}S_1$ is

$$-(A_{22} - A_{21}A_{11}^{-1}A_{12})^{-1}A_{21}A_{11}^{-1}B_{12} + (A_{22} - A_{21}A_{11}^{-1}A_{12})^{-1}B_{22} := C. \tag{3.18}$$

The part

$$-(A_{22} - A_{21}A_{11}^{-1}A_{12})^{-1}A_{21}A_{11}^{-1}B_{12} = -(A_{22} - A_{21}A_{11}^{-1}A_{12})^{-1}A_{21}A_{11}^{-1}\cdot\frac{1}{T}X_1X_2^*$$

is of rank $M$; besides, we have

$$\mathrm{tr}\Big\{(A_{22} - A_{21}A_{11}^{-1}A_{12})^{-1}A_{21}A_{11}^{-1}\frac{1}{T}X_1X_2^*\Big\} \to 0,$$

since $X_1$ is independent of $X_2$. Therefore, the $M$ nonzero eigenvalues of the matrix
$-(A_{22} - A_{21}A_{11}^{-1}A_{12})^{-1}A_{21}A_{11}^{-1}B_{12}$ will all tend to zero (and so does its largest one). Then consider the
second part of (3.18) as follows.

$$A_{22} - A_{21}A_{11}^{-1}A_{12} = \frac{1}{n}Z_2\big[I_n - Z_1^*(Z_1Z_1^*)^{-1}Z_1\big]Z_2^* := \frac{1}{n}Z_2PZ_2^*.$$

Since $P = I_n - Z_1^*(Z_1Z_1^*)^{-1}Z_1$ is a projection matrix of rank $n - M$, it has the spectral
decomposition:

$$P = V\,\mathrm{diag}(\underbrace{0, \dots, 0}_{M}, \underbrace{1, \dots, 1}_{n - M})\,V^*,$$

where $V$ is a $n \times n$ orthogonal matrix. Since $M$ is fixed, the ESD of $P$ tends to $\delta_1$, which
leads to the fact that the LSD of the matrix $\tfrac{1}{n}Z_2PZ_2^*$ is the standard Marcenko-Pastur law.
Then the matrix $\big(\tfrac{1}{n}Z_2PZ_2^*\big)^{-1}B_{22}$ is a standard Fisher matrix, and its largest eigenvalues
(finitely many) will tend to the right edge $b$ of the Wachter distribution. It follows that
the two largest eigenvalues of $C$, say $\alpha_1(C)$ and $\alpha_2(C)$, also tend to $b$.

Next, since $C$ is the $(p - M) \times (p - M)$ major sub-matrix of $S_2^{-1}S_1$, we have by the Cauchy
interlacing theorem

$$\alpha_2(C) \le l_{p,M+1} \le \alpha_1(C) \le l_{p,1}.$$

Thus $l_{p,M+1} \to b$ as well. On the other hand, we have



so that for some positive constant $\theta$, $\limsup l_{p,1} \le \theta$. Consequently, almost surely,

$$b \le \liminf l_{p,M} \le \cdots \le \limsup l_{p,1} \le \theta < \infty;$$

in particular the whole family $\{l_{p,j},\ 1 \le j \le M\}$ is bounded. Now let $1 \le j \le M$ be fixed
and assume that a subsequence $(l_{p_k,j})_k$ converges to a limit $\beta \in [b, \theta]$. Either $\beta = \phi(a_i) > b$
or $\beta = b$. However, according to Step 2, $\beta > b$ implies that $a_i > \gamma\{1 + \sqrt{c + y - cy}\}$, and
otherwise, we have $a_i \le \gamma\{1 + \sqrt{c + y - cy}\}$. Therefore, according to which of these two
conditions holds, all subsequences converge to the same limit $\phi(a_i)$ or $b$, which is thus also the
unique limit of the whole sequence $(l_{p,j})_p$. The proof of Theorem 3.1 is complete. □

4. Central limit theorem for the outlier eigenvalues of $S_2^{-1}S_1$

The aim of this section is to give a CLT for the $n_i$-packed outlier eigenvalues:

$$\sqrt{p}\,\{l_{p,j} - \phi(a_i),\ j \in J_i\}.$$

Denote $U = (U_1\ U_2\ \cdots\ U_k)$, where each $U_i$ is a $M \times n_i$ matrix that corresponds to
the $n_i$-packed spike eigenvalue $a_i$.

Theorem 4.1. Assume the same assumptions as in Theorem 3.1 and in addition, the
variables $(z_{ij})$ (in (2.1)) and $(w_{kl})$ (in (2.2)) have the same first four moments and denote
$v_4$ as their common fourth moment:

$$v_4 = \mathbb{E}|z_{ij}|^4 = \mathbb{E}|w_{kl}|^4, \qquad 1 \le i, k \le p,\ 1 \le j \le n,\ 1 \le l \le T.$$

Then for any population spike $a_i$ satisfying $|a_i - \gamma| > \gamma\sqrt{c + y - cy}$, the normalised
$n_i$-packed outlier eigenvalues of $S_2^{-1}S_1$: $\sqrt{p}\,\{l_{p,j} - \phi(a_i),\ j \in J_i\}$ converge weakly to the
distribution of the eigenvalues of the random matrix $-U_i^*R(\lambda_i)U_i/\Delta(\lambda_i)$. Here,

$$\Delta(\lambda_i) = \frac{(1 - a_i - c)\,(1 + a_i(y - 1))^2}{(a_i - 1)\,(-1 + 2a_i + c + a_i^2(y - 1))}, \tag{4.1}$$

$R(\lambda_i) = (R_{mn})$ is a $M \times M$ symmetric random matrix, made with independent Gaussian
entries of mean zero and variance

$$\mathrm{Var}(R_{mn}) = \begin{cases} 2\theta_i + (v_4 - 3)\,\omega_i, & m = n, \\ \theta_i, & m \ne n, \end{cases} \tag{4.2}$$

where

$$\omega_i = \frac{a_i^2\,(a_i + c - 1)^2\,(c + y)}{(a_i - 1)^2}, \tag{4.3}$$

$$\theta_i = \frac{a_i^2\,(a_i + c - 1)^2\,(cy - c - y)}{-1 + 2a_i + c + a_i^2(y - 1)}. \tag{4.4}$$

Numerical illustrations of this theorem are detailed in the next section.



Remark 4.1. Notice that the result above involves the $i$-th block $U_i$ of the eigen-matrix
$U$. When the spike $a_i$ is simple, $U_i$ is unique up to its sign, so $U_i^*R(\lambda_i)U_i$ is uniquely
determined. But when $a_i$ has multiplicity greater than 1, $U_i$ is not unique; actually, any
rotation of $U_i$ can be an eigenvector matrix corresponding to $a_i$. However, Lemma A.1
in the Appendix states that such a rotation will not affect the eigenvalues of the matrix
$U_i^*R(\lambda_i)U_i$.
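
For a quick feel of the constants in Theorem 4.1, the sketch below evaluates $\Delta(\lambda_i)$, $\theta_i$ and $\omega_i$ from (4.1), (4.3) and (4.4). It also returns the limiting variance in one special case that is an assumption of this illustration: the spike is simple and $U = I_M$, so that $-U_i^*R(\lambda_i)U_i/\Delta(\lambda_i)$ reduces to the single Gaussian entry $-R_{ii}/\Delta(\lambda_i)$ with variance $(2\theta_i + (v_4 - 3)\omega_i)/\Delta(\lambda_i)^2$. Function names are hypothetical.

```python
def clt_constants(a, c, y):
    """Return (Delta, theta, omega) from (4.1), (4.4) and (4.3)."""
    q = -1 + 2 * a + c + a * a * (y - 1)
    delta = (1 - a - c) * (1 + a * (y - 1)) ** 2 / ((a - 1) * q)
    omega = a * a * (a + c - 1) ** 2 * (c + y) / (a - 1) ** 2
    theta = a * a * (a + c - 1) ** 2 * (c * y - c - y) / q
    return delta, theta, omega

def simple_spike_variance(a, c, y, v4=3.0):
    """Limiting variance of sqrt(p) * (l_{p,j} - phi(a)).

    Assumes a simple super-critical spike and U = I_M (diagonal Omega_M);
    v4 = 3 corresponds to Gaussian entries.
    """
    delta, theta, omega = clt_constants(a, c, y)
    return (2 * theta + (v4 - 3) * omega) / delta ** 2

delta, theta, omega = clt_constants(5.0, 0.2, 0.5)
print(delta, theta, omega)
print(simple_spike_variance(5.0, 0.2, 0.5))
# With y = 0 the Gaussian variance reduces to 2*c*a^2*((a-1)^2 - c)/(a-1)^2,
# i.e. c times the classical one-sample value 2*a^2*(1 - c/(a-1)^2).
print(simple_spike_variance(5.0, 0.2, 0.0))
```

The last line is a consistency check against the known one-sample spiked covariance CLT in the Gaussian case, rescaled from $\sqrt{T}$ to $\sqrt{p}$.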

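Proof. (proof of Theorem 4.1)

Step 1: Convergence to the eigenvalues of the random matrix $-U_i^*R(\lambda_i)U_i/\Delta(\lambda_i)$.
We start from (3.8). First we introduce some shorthand. With $S = \big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2X_2^*$ as
before, define

$$\begin{aligned}
A(\lambda) &= I_n - Z_2^*(\lambda I_p - S)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{\lambda}{n}Z_2, \\
B(\lambda) &= I_T + X_2^*(\lambda I_p - S)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2, \\
C(\lambda) &= Z_2^*(\lambda I_p - S)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{T}X_2, \\
D(\lambda) &= X_2^*(\lambda I_p - S)^{-1}\big(\tfrac{1}{n}Z_2Z_2^*\big)^{-1}\tfrac{1}{n}Z_2,
\end{aligned} \tag{4.5}$$

then (3.8) can be written as

$$\det\Big( \underbrace{\frac{l_{p,j}}{n}Z_1A(l_{p,j})Z_1^*}_{(i)} - \underbrace{\frac{1}{T}X_1B(l_{p,j})X_1^*}_{(ii)} + \underbrace{\frac{l_{p,j}}{n}Z_1C(l_{p,j})X_1^*}_{(iii)} + \underbrace{\frac{l_{p,j}}{T}X_1D(l_{p,j})Z_1^*}_{(iv)} \Big) = 0. \tag{4.6}$$

It remains to find a second order approximation of the four terms on the left hand
side of (4.6).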

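Using Lemma A.5 in the appendix, we have

$$\begin{aligned}
(i) &= \mathbb{E}\frac{\lambda_i}{n}Z_1A(\lambda_i)Z_1^* + \frac{l_{p,j}}{n}Z_1A(l_{p,j})Z_1^* - \mathbb{E}\frac{\lambda_i}{n}Z_1A(\lambda_i)Z_1^* \\
&= \big(\lambda_i + y\lambda_i^2 s(\lambda_i)\big)\cdot I_M + \frac{l_{p,j} - \lambda_i}{n}Z_1A(l_{p,j})Z_1^* + \frac{\lambda_i}{n}Z_1\big(A(l_{p,j}) - A(\lambda_i)\big)Z_1^* \\
&\qquad + \frac{\lambda_i}{\sqrt{n}}\Big[\frac{1}{\sqrt{n}}Z_1A(\lambda_i)Z_1^* - \mathbb{E}\frac{1}{\sqrt{n}}Z_1A(\lambda_i)Z_1^*\Big] \\
&\to \big(\lambda_i + y\lambda_i^2 s(\lambda_i)\big)\cdot I_M + (l_{p,j} - \lambda_i)\big(1 + 2y\lambda_i s(\lambda_i) + \lambda_i^2 y\, m_1(\lambda_i)\big)\cdot I_M \\
&\qquad + \frac{\lambda_i}{\sqrt{n}}\Big[\frac{1}{\sqrt{n}}Z_1A(\lambda_i)Z_1^* - \mathbb{E}\frac{1}{\sqrt{n}}Z_1A(\lambda_i)Z_1^*\Big],
\end{aligned} \tag{4.7}$$

$$\begin{aligned}
(ii) &= \mathbb{E}\frac{1}{T}X_1B(\lambda_i)X_1^* + \frac{1}{T}X_1B(l_{p,j})X_1^* - \mathbb{E}\frac{1}{T}X_1B(\lambda_i)X_1^* \\
&= \big(1 - c - c\lambda_i s(\lambda_i)\big)\,U\,\mathrm{diag}(a_1 I_{n_1}, \dots, a_k I_{n_k})\,U^* + \frac{1}{T}X_1\big(B(l_{p,j}) - B(\lambda_i)\big)X_1^* \\
&\qquad + \frac{1}{\sqrt{T}}\Big[\frac{1}{\sqrt{T}}X_1B(\lambda_i)X_1^* - \mathbb{E}\frac{1}{\sqrt{T}}X_1B(\lambda_i)X_1^*\Big] \\
&\to \big(1 - c - c\lambda_i s(\lambda_i)\big)\,U\,\mathrm{diag}(a_1 I_{n_1}, \dots, a_k I_{n_k})\,U^* - (l_{p,j} - \lambda_i)\cdot c\,m_3(\lambda_i)\,U\,\mathrm{diag}(a_1 I_{n_1}, \dots, a_k I_{n_k})\,U^* \\
&\qquad + \frac{1}{\sqrt{T}}\Big[\frac{1}{\sqrt{T}}X_1B(\lambda_i)X_1^* - \mathbb{E}\frac{1}{\sqrt{T}}X_1B(\lambda_i)X_1^*\Big],
\end{aligned} \tag{4.8}$$

$$\begin{aligned}
(iii) &= \frac{l_{p,j}}{n}Z_1C(l_{p,j})X_1^* - \mathbb{E}\frac{\lambda_i}{n}Z_1C(\lambda_i)X_1^* \\
&= \frac{l_{p,j}}{n}Z_1C(l_{p,j})X_1^* - \frac{\lambda_i}{n}Z_1C(\lambda_i)X_1^* + \frac{\lambda_i}{n}Z_1C(\lambda_i)X_1^* - \mathbb{E}\frac{\lambda_i}{n}Z_1C(\lambda_i)X_1^* \\
&= \frac{l_{p,j} - \lambda_i}{n}Z_1C(l_{p,j})X_1^* + \frac{\lambda_i}{n}Z_1\big(C(l_{p,j}) - C(\lambda_i)\big)X_1^* + \frac{\lambda_i}{\sqrt{n}}\Big[\frac{1}{\sqrt{n}}Z_1C(\lambda_i)X_1^* - \mathbb{E}\frac{1}{\sqrt{n}}Z_1C(\lambda_i)X_1^*\Big] \\
&\to \frac{\lambda_i}{\sqrt{n}}\Big[\frac{1}{\sqrt{n}}Z_1C(\lambda_i)X_1^* - \mathbb{E}\frac{1}{\sqrt{n}}Z_1C(\lambda_i)X_1^*\Big].
\end{aligned} \tag{4.9}$$

Similarly,

$$(iv) \to \frac{\lambda_i}{\sqrt{T}}\Big[\frac{1}{\sqrt{T}}X_1D(\lambda_i)Z_1^* - \mathbb{E}\frac{1}{\sqrt{T}}X_1D(\lambda_i)Z_1^*\Big]. \tag{4.10}$$

Denote

$$R_n(\lambda_i) = \sqrt{\frac{p}{n}}\,\frac{\lambda_i}{\sqrt{n}}Z_1A(\lambda_i)Z_1^* - \sqrt{\frac{p}{T}}\,\frac{1}{\sqrt{T}}X_1B(\lambda_i)X_1^* + \sqrt{\frac{p}{n}}\,\frac{\lambda_i}{\sqrt{n}}Z_1C(\lambda_i)X_1^* + \sqrt{\frac{p}{T}}\,\frac{\lambda_i}{\sqrt{T}}X_1D(\lambda_i)Z_1^* - \mathbb{E}[\cdot], \tag{4.11}$$
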


where $\mathbb{E}[\cdot]$ denotes the total expectation of all the preceding terms in the equation, and

$$\Delta(\lambda_i) = 1 + 2y\lambda_i s(\lambda_i) + \lambda_i^2 y\, m_1(\lambda_i) + a_i c\, m_3(\lambda_i).$$

Combining (4.6), (4.7), (4.8), (4.9), (4.10) and considering the diagonal block that
corresponds to the row and column index in $J_i \times J_i$ leads to:

$$\sqrt{p}\,(l_{p,j} - \lambda_i)\cdot\Delta(\lambda_i)\cdot I_{n_i} + [U^*R_n(\lambda_i)U]_{ii} \to 0. \tag{4.12}$$

Furthermore, it will be established in Step 2 below that

$$[U^*R_n(\lambda_i)U]_{ii} \to [U^*R(\lambda_i)U]_{ii} \quad \text{in distribution}, \tag{4.13}$$

for some random matrix $R(\lambda_i)$. Using the device of Skorokhod strong representation (Skorokhod,
1956; Hu and Bai, 2014), we may assume that this convergence holds almost surely by
considering an enlarged probability space. Under this device, (4.12) is equivalent to saying that
$\sqrt{p}\,(l_{p,j} - \lambda_i)$ tends to an eigenvalue of the matrix $-[U^*R(\lambda_i)U]_{ii}/\Delta(\lambda_i)$ $(= -U_i^*R(\lambda_i)U_i/\Delta(\lambda_i))$.
Finally, as the index $j$ is arbitrary over the set $J_i$, all the $n_i$ random variables

$$\sqrt{p}\,\{l_{p,j} - \lambda_i,\ j \in J_i\}$$

converge almost surely to the set of eigenvalues of the random matrix $-U_i^*R(\lambda_i)U_i/\Delta(\lambda_i)$.
Besides, due to Lemma A.3, we have

$$\Delta(\lambda_i) = 1 + 2y\lambda_i s(\lambda_i) + \lambda_i^2 y\, m_1(\lambda_i) + a_i c\, m_3(\lambda_i) = \frac{(1 - a_i - c)\,(1 + a_i(y - 1))^2}{(a_i - 1)\,(-1 + 2a_i + c + a_i^2(y - 1))}.$$
Step 2: Proof of the convergence (4.13) and structure of the random matrix
$R(\lambda_i)$. In the second step, we aim to find the matrix limit of the block random matrix
$[U^*R_n(\lambda_i)U]_{ii}$. First, we show that $[U^*R_n(\lambda_i)U]_{ii}$ equals another random matrix $[U^*\tilde R_n(\lambda_i)U]_{ii}$,
where $\tilde R_n(\lambda_i)$ is a random sesquilinear form. Then, using the results in Bai and Yao
(2008) (Proposition 3.1 and Remark 1), we are able to find the matrix limit of $\tilde R_n(\lambda_i)$.

By assumption (b), the first $M$ components satisfy

\[ X_1 = \Omega_M^{1/2} S_1 = U \,\mathrm{diag}\big(\sqrt{a_1}\,I_{n_1},\dots,\sqrt{a_k}\,I_{n_k}\big)\, U^* S_1 . \]

Recall the definition of Rn{Xi) in (4.11), we have 

U*Rn{Xi)U 

( \ ( \ 


= U*^^ZiA{Xi)ZlU - ^ 


+ U*^^ZiC{Xi)SlU 

n 


-E|.] 


y \/hfc J 

y \foXk J 


U*SiB{Xi)SlU 


T 


y \/hfc J 

I\ 

U*SiD{Xi)ZlU 


y \/hfc j 


(4.14) 


Therefore, if we consider its Tth block that corresponds to the row and column index in 
the set Jj X Jp 


[U*Rn{Xi)U], 




-^U'Z^A(\i)ZlU 

'n 


V 

O-iA I rp 


Vt 


U*SiB{X^)SlU 


+ Xi\/ai\ / — 


-=U* ZiC{Xi)SlU 

'n 


+ Xi\/ai\ / — 


T 


p r 1 


^/T 


U*SiD{Xi)Z*U 


-m 

\xllu*Z,A{Xi)ZlU - a,X^U*S^B{Xi)SlU 

n 1 

+ \^^U*Z^C{X:)SIU + \^^U*S^D{Xi)ZlU 

n 1 


■E[-] 

U*{Z, Sr) 


:= [U*Rn{X{)U], 
= U*Rn{Xi)Ui , 


n n 

T T 


'Zf 

.SI 


U - E[-] 


(4.15) 
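where

\[ \tilde R_n(\lambda_i) = (Z_1\ \ S_1)
\begin{pmatrix}
\dfrac{\lambda_i\sqrt{p}}{n}\,A(\lambda_i) & \dfrac{\lambda_i\sqrt{a_i p}}{n}\,C(\lambda_i) \\[2mm]
\dfrac{\lambda_i\sqrt{a_i p}}{T}\,D(\lambda_i) & -\dfrac{a_i\sqrt{p}}{T}\,B(\lambda_i)
\end{pmatrix}
\begin{pmatrix} Z_1^* \\ S_1^* \end{pmatrix} - E[\cdot] . \]

Finally, using Lemma A.6 in the appendix leads to the result. The proof of Theorem 4.1
is complete. □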


Next we consider a special case where $\Omega_p$ is diagonal and its eigenvalues $a_i$ are all simple.
In other words, we have $M = k$ and $n_i = 1$ for all $1 \le i \le M$, hence $U = I_M$. Following
Theorem 4.1, we can derive the asymptotic normality of the normalised outlier eigenvalues
of $S_2^{-1}S_1$ when $|a_i-\gamma| > \gamma\sqrt{c+y-cy}$.

Proposition 4.1. Under the same assumptions as in Theorem 3.1, with the additional conditions
that $\Omega_p$ is diagonal and all its eigenvalues $a_i$ ($1 \le i \le M$) are simple, we have, when
$|a_i-\gamma| > \gamma\sqrt{c+y-cy}$, that the outlier eigenvalue $l_{p,i}$ of $S_2^{-1}S_1$ is asymptotically Gaussian:

\[ \sqrt{p}\left( l_{p,i} - \frac{a_i(a_i-1+c)}{a_i-1-a_i y} \right) \to N(0,\sigma_i^2) , \]

where

\[ \sigma_i^2 = \frac{2a_i^2(cy-c-y)(a_i-1)^2\,(-1+2a_i+c+a_i^2(y-1))}{(1+a_i(y-1))^4}
 + (v_4-3)\cdot\frac{a_i^2(c+y)\,(-1+2a_i+c+a_i^2(y-1))^2}{(1+a_i(y-1))^4} . \]


Remark 4.2. Notice that when the data are standard Gaussian, we have $v_4 = 3$; the
above result then reduces to

\[ \sqrt{p}\left( l_{p,i} - \frac{a_i(a_i-1+c)}{a_i-1-a_i y} \right) \to
 N\!\left(0,\ \frac{2a_i^2(a_i-1)^2(cy-c-y)\,(-1+2a_i+c+a_i^2(y-1))}{(1+a_i(y-1))^4}\right) , \]

which is exactly the result in Dharmawansa et al. (2014), see setting 1 in their Proposition 11.


Proof (of Proposition 4.1). Under the above assumptions, the random matrix $-[U^*R(\lambda_i)U]_{ii}$
reduces to $-[R(\lambda_i)]_{ii}$. Since all the $n_i = 1$, $-[R(\lambda_i)]_{ii}$ equals the $(i,i)$-th element
of $-R(\lambda_i)$, which is a Gaussian random variable with mean zero and variance

\[ 2\theta_i + (v_4-3)\,\omega_i = \frac{2a_i^2(a_i+c-1)^2(cy-c-y)}{-1+2a_i+c+a_i^2(y-1)}
 + (v_4-3)\cdot\frac{a_i^2(a_i+c-1)^2(c+y)}{(a_i-1)^2} . \]

Therefore, combining with (4.1) we have

\[ \sqrt{p}\left( l_{p,i} - \frac{a_i(a_i-1+c)}{a_i-1-a_i y} \right) \to N(0,\sigma_i^2) , \]

where

\[ \sigma_i^2 = \frac{2a_i^2(cy-c-y)(a_i-1)^2\,(-1+2a_i+c+a_i^2(y-1))}{(1+a_i(y-1))^4}
 + (v_4-3)\cdot\frac{a_i^2(c+y)\,(-1+2a_i+c+a_i^2(y-1))^2}{(1+a_i(y-1))^4} . \]

The proof of Proposition 4.1 is complete. □


5. Numerical illustrations 

In this section, numerical results are provided to illustrate the results of our Theorem
4.1 and Proposition 4.1. We fix $p = 200$, $T = 1000$, $n = 400$ with 1000 replications, thus
$y = 1/2$ and $c = 1/5$. The critical interval is then $[\gamma-\gamma\sqrt{c+y-cy},\ \gamma+\gamma\sqrt{c+y-cy}] =
[0.45, 3.55]$ and the support of the LSD is $[0.2, 12.6]$. Consider $k = 3$ spike eigenvalues
$(a_1, a_2, a_3) = (20, 0.2, 0.1)$ with respective multiplicities $(n_1, n_2, n_3) = (1, 2, 1)$. Let $l_1 \ge \cdots \ge
l_p$ be the ordered eigenvalues of the Fisher matrix $S_2^{-1}S_1$. We are particularly interested in
the distributions of $l_1$, $(l_{p-2}, l_{p-1})$ and $l_p$, which correspond to the spike eigenvalues $a_1$, $a_2$
and $a_3$, respectively.
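The constants quoted in this setup can be recomputed with a few lines of Python. This is only a sketch: the function names are illustrative, and the closed form $((1 \mp h)/(1-y))^2$ with $h = \sqrt{c+y-cy}$ used for the support edges is an assumption (the usual description of the LSD of a Fisher matrix), since that formula is not restated in this section.

```python
import math

def critical_interval(c, y):
    """Interval [gamma - gamma*h, gamma + gamma*h] with gamma = 1/(1-y), h = sqrt(c+y-cy)."""
    h = math.sqrt(c + y - c * y)
    g = 1.0 / (1.0 - y)
    return g * (1.0 - h), g * (1.0 + h)

def support_edges(c, y):
    """Assumed closed form for the edges of the LSD support: ((1 -/+ h)/(1-y))^2."""
    h = math.sqrt(c + y - c * y)
    return ((1.0 - h) / (1.0 - y)) ** 2, ((1.0 + h) / (1.0 - y)) ** 2

def phi(a, c, y):
    """Outlier limit a(a-1+c)/(a-1-ay) used throughout the section."""
    return a * (a - 1.0 + c) / (a - 1.0 - a * y)

p, T, n = 200, 1000, 400
c, y = p / T, p / n          # c = 1/5, y = 1/2

lo, hi = critical_interval(c, y)
left, right = support_edges(c, y)
limits = [phi(a, c, y) for a in (20.0, 0.2, 0.1)]

print("critical interval:", lo, hi)
print("support edges:", left, right)
print("outlier limits:", limits)
```

With these values, all three spikes $(20, 0.2, 0.1)$ lie outside the critical interval, so each of them produces outlier eigenvalues.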

5.1. Case of $U = I_4$

In this subsection, we consider the simple case $U = I_4$. Following Theorem
4.1, we have

• for $j = 1, p$: $\sqrt{p}\,\{l_j - \phi(a_i)\} \to N(0,\sigma_i^2)$. Here, for $j = 1$, $i = 1$, $\phi(a_1) = 42.67$
and $\sigma_1^2 = 4246.8 + 1103.5(v_4-3)$; and for $j = p$, $i = 3$, $\phi(a_3) = 0.07$ and $\sigma_3^2 =
7.2 \times 10^{-3} + 3.15 \times 10^{-3}(v_4-3)$.

• for $j = p-2, p-1$ and $i = 2$, the two-dimensional random vector $\sqrt{p}\,\{l_j - \phi(a_2)\}$
converges to the eigenvalues of the random matrix $-R/\Delta(\lambda_2)$. Here, $\phi(a_2) = 0.13$,
$\Delta(\lambda_2) = 1.45$ and $R = (R_{mn})$ is the $2 \times 2$ symmetric random matrix made with
independent Gaussian entries of mean zero and variance

\[ \mathrm{Var}(R_{mn}) = \begin{cases}
2\theta_2 + (v_4-3)\,\omega_2 \ \ (= 0.04 + 0.016(v_4-3)) , & m = n , \\
\theta_2 \ \ (= 0.02) , & m \ne n .
\end{cases} \tag{5.1} \]

Simulations are conducted to compare the distributions of the empirical extreme eigenvalues
with their limits.
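The numerical constants in the two bullets follow from the closed forms of $\theta_i$, $\omega_i$ and $\Delta(\lambda_i)$: the limiting variance of a simple outlier is $(2\theta_i + (v_4-3)\omega_i)/\Delta(\lambda_i)^2$. A minimal Python sketch (function names are illustrative, not part of the paper) that recomputes them for $c = 1/5$, $y = 1/2$:

```python
def theta(a, c, y):
    """theta_i = a^2 (a+c-1)^2 (cy-c-y) / (-1+2a+c+a^2(y-1))."""
    return a**2 * (a + c - 1) ** 2 * (c * y - c - y) / (-1 + 2 * a + c + a**2 * (y - 1))

def omega(a, c, y):
    """omega_i = a^2 (a+c-1)^2 (c+y) / (a-1)^2."""
    return a**2 * (a + c - 1) ** 2 * (c + y) / (a - 1) ** 2

def delta(a, c, y):
    """Delta(lambda_i) = (1-a-c)(1+a(y-1))^2 / ((a-1)(-1+2a+c+a^2(y-1)))."""
    return (1 - a - c) * (1 + a * (y - 1)) ** 2 / ((a - 1) * (-1 + 2 * a + c + a**2 * (y - 1)))

def sigma2(a, c, y, v4):
    """Limiting variance of sqrt(p)(l - phi(a)) for a simple spike a."""
    return (2 * theta(a, c, y) + (v4 - 3) * omega(a, c, y)) / delta(a, c, y) ** 2

c, y = 0.2, 0.5
for a in (20.0, 0.1):
    # Gaussian part and the coefficient of (v4 - 3)
    print(a, sigma2(a, c, y, 3.0), sigma2(a, c, y, 4.0) - sigma2(a, c, y, 3.0))
print(theta(0.2, c, y), omega(0.2, c, y), delta(0.2, c, y))
```

Setting `v4 = 1.0` in `sigma2` gives the variances used for binary entries.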

5.1.1. Gaussian case 

First, we assume all the $z_{ij}$ and $w_{ij}$ are i.i.d. standard Gaussian, thus $v_4 - 3 = 0$.
According to (5.1), $R/\sqrt{0.04}$ is then the standard $2 \times 2$ Gaussian Wigner matrix (GOE).
Therefore, we have
Figure 2: Upper panels show the empirical densities of $l_1$ and $l_p$ (solid lines, after centralisation
and scaling) compared to their Gaussian limits (dashed lines). Lower panels show
contour plots of the empirical joint density function of $(l_{p-2}, l_{p-1})$ (left plot, "Empirical", after
centralisation and scaling) and contour plots of their limits (right plot, "Limit"). Both the empirical
and limit joint density functions are displayed using two-dimensional kernel density estimates.
Samples are from the i.i.d. standard Gaussian distribution with $U = I_4$ and 1000 independent
replications.
• $\sqrt{p}\,\{l_1 - 42.67\} \to N(0, 4246.8)$,

• $\sqrt{p}\,\{l_p - 0.07\} \to N(0, 7.2 \times 10^{-3})$,

• the two-dimensional random vector $\sqrt{p}\,\{l_{p-2} - 0.13,\ l_{p-1} - 0.13\}$ converges to the
eigenvalues of the random matrix $-0.138 \cdot W$, where $W$ is a $2 \times 2$ GOE.

Figure 2, upper panels, shows the empirical kernel density estimates (solid lines) of
$\sqrt{p}\,\{l_1 - 42.67\}$ and $\sqrt{p}\,\{l_p - 0.07\}$ from 1000 independent replications, compared to their
Gaussian limits $N(0, 4246.8)$ and $N(0, 7.2 \times 10^{-3})$, respectively (dashed lines). For
the empirical distribution of the two-dimensional random vector $\sqrt{p}\,\{l_{p-2} - 0.13,\ l_{p-1} -
0.13\}$, we run a two-dimensional kernel density estimation from 1000 independent replications
and display its contour lines in the lower-left panel of the figure, while the
lower-right panel shows the contour lines of the kernel density estimate of the eigenvalues
of the $2 \times 2$ random matrix $-0.138 \cdot \mathrm{GOE}$ (their limits).
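Draws from the limiting law in the last bullet are easy to simulate. The sketch below is an illustration only: it assumes the GOE normalisation implied by (5.1), namely diagonal entries of variance 1 and off-diagonal entries of variance 1/2, and the helper name `sample_limit_pairs` is hypothetical.

```python
import numpy as np

def sample_limit_pairs(n_rep, scale=0.138, seed=0):
    """Eigenvalue pairs of -scale * W, with W a 2x2 GOE (diag var 1, off-diag var 1/2)."""
    rng = np.random.default_rng(seed)
    pairs = np.empty((n_rep, 2))
    for r in range(n_rep):
        g = rng.standard_normal((2, 2))
        w = (g + g.T) / 2.0                       # symmetrise: Var(w_11) = 1, Var(w_12) = 1/2
        pairs[r] = np.linalg.eigvalsh(-scale * w)  # eigenvalues in ascending order
    return pairs

pairs = sample_limit_pairs(1000)
print(pairs.mean(axis=0), pairs.std(axis=0))
```

A two-dimensional kernel density estimate of `pairs` would then play the role of the "limit" contour plot.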

5.1.2. Binary case 

Second, we assume all the $z_{ij}$ and $w_{ij}$ are i.i.d. binary variables taking values $\{1, -1\}$
with probability 1/2, and in this case we have $v_4 = 1$. Similarly, we have

• $\sqrt{p}\,\{l_1 - 42.67\} \to N(0, 2039.8)$,

• $\sqrt{p}\,\{l_p - 0.07\} \to N(0, 9 \times 10^{-4})$,

• the two-dimensional random vector $\sqrt{p}\,\{l_{p-2} - 0.13,\ l_{p-1} - 0.13\}$ converges to the
eigenvalues of the random matrix $-R/1.45$. Here, $R = (R_{mn})$ is the $2 \times 2$ symmetric
random matrix made with independent Gaussian entries of mean zero and variance

\[ \mathrm{Var}(R_{mn}) = \begin{cases} 2\theta_2 - 2\omega_2 \ \ (\approx 0.01) , & m = n , \\ \theta_2 \ \ (= 0.02) , & m \ne n . \end{cases} \]

Figure 3, upper panels, shows the empirical kernel density estimates of $\sqrt{p}\,\{l_1 - 42.67\}$
and $\sqrt{p}\,\{l_p - 0.07\}$ from 1000 independent replications (solid lines), compared to their
Gaussian limits (dashed lines). The lower panels of the figure show the contour lines
of the empirical joint density of $\sqrt{p}\,\{l_{p-2} - 0.13,\ l_{p-1} - 0.13\}$ (left plot), with the
right plot displaying the contour lines of their limit.


5.2. Case of general U 


In this subsection, we consider the following orthogonal matrix, which is not the identity:


/I 0 0 

0 1 0 
0 0 

VO 0 ^ 


0 



(5.2) 
Figure 3: Upper panels show the empirical densities of $l_1$ and $l_p$ (solid lines, after centralisation
and scaling) compared to their Gaussian limits (dashed lines). Lower panels show
contour plots of the empirical joint density function of $(l_{p-2}, l_{p-1})$ (left plot, "Empirical", after
centralisation and scaling) and contour plots of their limits (right plot, "Limit"). Both the empirical
and limit joint density functions are displayed using two-dimensional kernel density estimates.
Samples are from the i.i.d. binary distribution with $U = I_4$ and 1000 independent replications.
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i.e., we have 


n\ 






0 


1 

0 


0 

0 

, f/2 = 

0 

1 

, U3 = 

1 


P2 


72 

\v 


Vo 

73/ 


Vts/ 


Since the Gaussian distribution is invariant under orthogonal transformations, we only consider
the case where all the $z_{ij}$ and $w_{ij}$ are i.i.d. binary variables taking values $\{1, -1\}$ with
probability 1/2, with all the other settings fixed as in Section 5.1. Then according to
Theorem 4.1, we have

• $\sqrt{p}\,\{l_1 - 42.67\} \to N(0, 2039.8)$,

• $\sqrt{p}\,\{l_p - 0.07\} \to N(0, 0.004)$,

• the two-dimensional random vector $\sqrt{p}\,\{l_{p-2} - 0.13,\ l_{p-1} - 0.13\}$ converges to the
eigenvalues of the random matrix $-U_2^*R(\lambda_2)U_2/1.45$. Here, $R(\lambda_2) = (R_{mn})$ is the $4 \times 4$
symmetric random matrix made with independent Gaussian entries of mean zero and
variance

\[ \mathrm{Var}(R_{mn}) = \begin{cases} 2\theta_2 - 2\omega_2 \ \ (\approx 0.01) , & m = n , \\ 0.02 , & m \ne n . \end{cases} \]

Figure 4, upper panels, shows the empirical kernel density estimates of $\sqrt{p}\,\{l_1 - 42.67\}$
and $\sqrt{p}\,\{l_p - 0.07\}$ from 1000 independent replications (solid lines), compared to their
Gaussian limits (dashed lines). The lower panels of the figure show the contour
lines of the empirical joint density of $\sqrt{p}\,\{l_{p-2} - 0.13,\ l_{p-1} - 0.13\}$ (lower-left plot), with
the lower-right plot showing the contour lines of their limit.


6. Joint distribution of the outlier eigenvalues


In the previous section, we have obtained the following result for the outlier eigenvalues:
the $n_i$-dimensional real random vector $\sqrt{p}\,\{l_{p,j} - \lambda_i,\ j \in J_i\}$ converges to the distribution of
the eigenvalues of the random matrix $-U_i^*R(\lambda_i)U_i/\Delta(\lambda_i)$. It is in fact possible to derive their
joint distribution, i.e. the limit of the $M$-dimensional real random vector

\[ \begin{pmatrix} \sqrt{p}\,\{l_{p,j_1} - \lambda_1,\ j_1 \in J_1\} \\ \vdots \\ \sqrt{p}\,\{l_{p,j_k} - \lambda_k,\ j_k \in J_k\} \end{pmatrix} . \tag{6.1} \]

Such joint convergence results are useful for inference procedures where consecutive sample
eigenvalues are used, such as their differences or ratios; see e.g. Onatski (2009) and
Passemier and Yao (2014).

Theorem 6.1. Assume the same conditions as in Theorem 4.1 and that all the population
spikes $a_i$ satisfy the condition $|a_i-\gamma| > \gamma\sqrt{c+y-cy}$. Then the $M$-dimensional vector in
Figure 4: Upper panels show the empirical densities of $l_1$ and $l_p$ (solid lines, after centralisation
and scaling) compared to their Gaussian limits (dashed lines). Lower panels show
contour plots of the empirical joint density function of $(l_{p-2}, l_{p-1})$ (left plot, after centralisation
and scaling) and contour plots of their limits (right plot). Both the empirical and limit
joint density functions are displayed using two-dimensional kernel density estimates.
Samples are from the i.i.d. binary distribution with $U$ given by (5.2) and 1000 independent
replications.
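(6.1) converges in distribution to the eigenvalues of the $M \times M$ random matrix

\[ \begin{pmatrix}
-\dfrac{U_1^*R(\lambda_1)U_1}{\Delta(\lambda_1)} & & 0 \\
 & \ddots & \\
0 & & -\dfrac{U_k^*R(\lambda_k)U_k}{\Delta(\lambda_k)}
\end{pmatrix} , \tag{6.2} \]

where the matrices $R(\lambda_i)$, made with zero-mean independent Gaussian random variables,
are defined in Theorem 4.1, with the following covariance function between different
blocks ($l \ne s$): for $1 \le i \le j \le M$,

\[ \mathrm{Cov}\big(R(\lambda_l)(i,j),\ R(\lambda_s)(i,j)\big) = \begin{cases}
\theta(l,s) , & i \ne j , \\
\omega(l,s)(v_4-3) + 2\theta(l,s) , & i = j ,
\end{cases} \]

where

\[ \theta(l,s) = \lim \frac{1}{n+T}\,\mathrm{tr}\, A_n(\lambda_l)A_n(\lambda_s) , \qquad
\omega(l,s) = \lim \frac{1}{n+T}\sum_{i=1}^{n+T} A_n(\lambda_l)(i,i)\,A_n(\lambda_s)(i,i) , \]

and $A_n(\lambda)$ is defined in (A.16).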

The proof of this theorem is very close to that of Theorem 2.3 in Wang et al. (2014)
and is thus omitted.

In principle, the limiting parameters $\theta(l,s)$ and $\omega(l,s)$ can be completely specified for a
given spiked structure. However, this leads to quite complex formulas. Here, we
explain a simple case where $\Omega_p$ is diagonal and its eigenvalues $a_i$, satisfying $|a_i-\gamma| > \gamma\sqrt{c+y-cy}$
($i = 1,\dots,M$), are all simple, so that $U = I_M$, $M = k$ and $n_i = 1$ ($i = 1,\dots,M$). Therefore,
$U_i^*R(\lambda_i)U_i$ in (6.2) reduces to the $(i,i)$-th element of $R(\lambda_i)$, which is a Gaussian random
variable. Besides, from Theorem 6.1, we see that the random variables $\{R(\lambda_i)(i,i)\}_{i=1,\dots,M}$
are jointly independent since the index sets $(i,i)$ are disjoint. Finally, we have the following
joint distribution of the $M$ outlier eigenvalues of $S_2^{-1}S_1$.


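Proposition 6.1. Under the same assumptions as in Theorem 4.1, when $\Omega_p$ is diagonal
with all its eigenvalues $a_i$, satisfying $|a_i-\gamma| > \gamma\sqrt{c+y-cy}$, being simple, the $M$ outlier eigenvalues $l_{p,j}$
($j = 1,\dots,M$) of $S_2^{-1}S_1$ are asymptotically independent Gaussian:

\[ \begin{pmatrix} \sqrt{p}\,(l_{p,1}-\lambda_1) \\ \vdots \\ \sqrt{p}\,(l_{p,M}-\lambda_M) \end{pmatrix}
\to N\!\left( 0_M ,\ \begin{pmatrix} \sigma_1^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_M^2 \end{pmatrix} \right) , \]

where

\[ \sigma_i^2 = \frac{2a_i^2(cy-c-y)(a_i-1)^2\,(-1+2a_i+c+a_i^2(y-1))}{(1+a_i(y-1))^4}
 + (v_4-3)\cdot\frac{a_i^2(c+y)\,(-1+2a_i+c+a_i^2(y-1))^2}{(1+a_i(y-1))^4} . \]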
7. Application to large-dimensional signal detection

In this section, we develop an application of the previous results to an inference problem
where spiked Fisher matrices arise naturally. In a signal detection equipment, records are
of the form

\[ x_i = A s_i + e_i , \qquad i = 1,\dots,T , \tag{7.1} \]

where $x_i$ is $p$-dimensional, $s_i$ is a $k \times 1$ low-dimensional signal ($k \ll p$) with unit covariance
matrix, $A$ is a $p \times k$ mixing matrix, and $(e_i)$ is an i.i.d. noise with covariance matrix $\Sigma_2$.
Therefore, the covariance matrix of $x_i$ can be considered as a $k$-dimensional perturbation
of $\Sigma_2$, denoted as $\Sigma_p$ in the following. Notice that none of the quantities on the r.h.s. of
(7.1) is observed. One of the fundamental problems here is to estimate $k$, the number of
signals present in the system. This problem is challenging when the dimension $p$ is large,
say of a magnitude comparable with the sample size $T$. When the noise has the simplest
covariance structure, i.e. $\Sigma_2 = \sigma_e^2 I_p$, this problem has been much investigated recently
and several solutions have been proposed, see e.g. Kritchman and Nadler (2008), Nadler (2010),
Passemier and Yao (2012, 2014). However, the problem with an arbitrary noise covariance
matrix $\Sigma_2$, say diagonal to simplify, remains unsolved in the large-dimensional context (to
the best of our knowledge). Nevertheless, there exists an astute engineering device where
the system can be tuned in a signal-free environment, for example in a laboratory: that is, we
can directly record a sequence of pure-noise observations $z_j$, $j = 1,\dots,n$, which have the
same distribution as the $(e_i)$ above. These signal-free records can then be used to whiten
the observations $(x_i)$ as follows. Let $S_1 = \frac{1}{T}\sum_{i=1}^{T} x_i x_i^*$, $S_2 = \frac{1}{n}\sum_{j=1}^{n} z_j z_j^*$, and let $l_i$, $i =
1,\dots,p$, be the eigenvalues of $S_2^{-1}S_1$. Notice that the eigenvalues $\{l_i\}$ are invariant under
the transformation $S_1 \mapsto \Sigma_2^{-1/2}S_1\Sigma_2^{-1/2}$, $S_2 \mapsto \Sigma_2^{-1/2}S_2\Sigma_2^{-1/2}$: they are in fact independent
of $\Sigma_2$. Therefore, these eigenvalues can be treated as if $\Sigma_2 = I_p$, that is, $S_2^{-1}S_1$ becomes a
spiked Fisher matrix as introduced in Section 2. This is actually the reason why the two-sample
procedure developed here can deal with an arbitrary covariance matrix of the noise
while the existing one-sample procedures cannot. Based on Theorem 3.1, we propose our
estimator of the number of signals as the number of eigenvalues of $S_2^{-1}S_1$ larger than the
right edge point of the support of its LSD:

\[ \hat k = \max\{ i : l_i \ge b + d_n \} , \tag{7.2} \]

where $(d_n)$ is a sequence of vanishing constants.
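A minimal Python sketch of the detector (7.2) is given here for illustration. The helper names are hypothetical; the right edge is taken as $b = ((1+h)/(1-y))^2$ with $h = \sqrt{c+y-cy}$, which is an assumption based on the usual form of the LSD of a Fisher matrix; the default tuning sequence $d_n = (\log\log p)/p^{2/3}$ is the one used in the simulation of this section; and the non-centred sample covariance matrices are likewise an assumption.

```python
import math
import numpy as np

def right_edge(c, y):
    """Assumed right edge b of the LSD support: ((1 + h)/(1 - y))^2, h = sqrt(c+y-cy)."""
    h = math.sqrt(c + y - c * y)
    return ((1.0 + h) / (1.0 - y)) ** 2

def fisher_eigenvalues(X, Z):
    """Eigenvalues of S2^{-1} S1 in decreasing order.

    X: p x T array of observations (signal plus noise).
    Z: p x n array of noise-only observations used for whitening.
    """
    S1 = X @ X.T / X.shape[1]
    S2 = Z @ Z.T / Z.shape[1]
    eig = np.linalg.eigvals(np.linalg.solve(S2, S1)).real
    return np.sort(eig)[::-1]

def estimate_num_signals(eigvals, p, n, T, d_n=None):
    """Number of eigenvalues >= b + d_n, i.e. max{i : l_i >= b + d_n} for ordered l_i."""
    c, y = p / T, p / n
    if d_n is None:
        d_n = math.log(math.log(p)) / p ** (2.0 / 3.0)
    return int(np.sum(np.asarray(eigvals) >= right_edge(c, y) + d_n))

# Toy usage on a hand-made spectrum: three values exceed b + d_n (about 12.65 here).
k_hat = estimate_num_signals([45.0, 20.0, 16.0, 12.0, 5.0, 1.0], p=200, n=400, T=1000)
print(k_hat)
```

Because the $l_i$ are ordered, counting the eigenvalues above the threshold gives the same value as the maximum in (7.2), with the convention that the estimate is 0 when no eigenvalue exceeds the threshold.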

Theorem 7.1. Assume all the spike eigenvalues $a_i$ ($i = 1,\dots,k$) satisfy $a_i > \gamma +
\gamma\sqrt{c+y-cy}$. Let $d_n$ be a sequence of positive numbers such that $d_n \to 0$, $\sqrt{p}\cdot d_n \to 0$ and
$p^{2/3}\cdot d_n \to +\infty$ as $p \to +\infty$; then the estimator $\hat k$ is consistent, i.e. $\hat k \to k$ in probability as
$p \to +\infty$.

Remark 7.1. Notice here that there is no need for the spikes $a_i$ to be simple; the only
requirement is that they be strong enough ($a_i > \gamma + \gamma\sqrt{c+y-cy}$) for
detection.
Proof (of Theorem 7.1). Since

\[ \{\hat k = k\} = \{ k = \max\{ i : l_i \ge b + d_n \} \} = \{ \forall j \in \{1,\dots,k\},\ l_j \ge b + d_n \} \cap \{ l_{k+1} < b + d_n \} , \]

we have

\[ \begin{aligned}
P\{\hat k = k\} &= P\Big( \bigcap_{1 \le j \le k} \{ l_j \ge b + d_n \} \cap \{ l_{k+1} < b + d_n \} \Big) \\
&= 1 - P\Big( \bigcup_{1 \le j \le k} \{ l_j < b + d_n \} \cup \{ l_{k+1} \ge b + d_n \} \Big) \\
&\ge 1 - \sum_{j=1}^{k} P( l_j < b + d_n ) - P( l_{k+1} \ge b + d_n ) .
\end{aligned} \tag{7.3} \]

For $j = 1,\dots,k$,

\[ \begin{aligned}
P( l_j < b + d_n ) &= P\big( \sqrt{p}\,(l_j - \phi(a_j)) < \sqrt{p}\,(b + d_n - \phi(a_j)) \big) \\
&\to P\big( \sqrt{p}\,(l_j - \phi(a_j)) < \sqrt{p}\,(b - \phi(a_j)) \big) ,
\end{aligned} \tag{7.4} \]

which is due to the assumption that $\sqrt{p}\cdot d_n \to 0$. Then the part $\sqrt{p}\,(b - \phi(a_j))$ in (7.4)
tends to $-\infty$, since we always have $\phi(a_j) > b$ when $a_j > \gamma + \gamma\sqrt{c+y-cy}$. On the
other hand, by Theorem 4.1, $\sqrt{p}\,(l_j - \phi(a_j))$ in (7.4) has a limiting distribution; it is then
bounded in probability. Therefore, we have

\[ P( l_j < b + d_n ) \to 0 \quad \text{for } j = 1,\dots,k . \tag{7.5} \]

Also

\[ P( l_{k+1} \ge b + d_n ) = P\big( p^{2/3}(l_{k+1} - b) \ge p^{2/3}\cdot d_n \big) , \]

and the part $p^{2/3}(l_{k+1} - b)$ is asymptotically Tracy-Widom distributed (see Bao et al.
(2015), where the Tracy-Widom distribution for the largest eigenvalue of a general sample
covariance matrix is derived). As $p^{2/3}\cdot d_n$ tends to infinity by assumption, we have

\[ P( l_{k+1} \ge b + d_n ) \to 0 . \tag{7.6} \]

Combining (7.3), (7.5) and (7.6), we have $P\{\hat k = k\} \to 1$ as $p \to +\infty$. The proof of Theorem
7.1 is complete. □

We conduct a short simulation to illustrate the performance of our estimator. We fix
$y = 1/2$ and $c = 1/5$ as in Section 5, and the value of $p$ varies from 50 to 250; therefore, the
critical value for $a_i$ in the model (2.4) (after whitening) is $\gamma(1 + \sqrt{c+y-cy}) = 3.55$.
For each given triple $(p, n, T)$, we repeat the experiment 1000 times. The tuning parameter $d_n$ is chosen
to be $(\log\log p)/p^{2/3}$.
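The choice $d_n = (\log\log p)/p^{2/3}$ satisfies both rate conditions of Theorem 7.1, since $\sqrt{p}\,d_n = (\log\log p)/p^{1/6} \to 0$ and $p^{2/3}d_n = \log\log p \to +\infty$, although both convergences are slow. The following Python sketch (illustrative only; `d_n` and `sqrt_p_dn` are hypothetical helper names) tabulates the two quantities:

```python
import math

def d_n(p):
    """Tuning sequence used in the simulation: log(log p) / p^(2/3)."""
    return math.log(math.log(p)) / p ** (2.0 / 3.0)

def sqrt_p_dn(p):
    """sqrt(p) * d_n, which must tend to 0."""
    return math.sqrt(p) * d_n(p)

for p in (50, 250, 10**4, 10**6, 10**12):
    # columns: p, d_n, sqrt(p) * d_n (decreasing), p^(2/3) * d_n (increasing)
    print(p, d_n(p), sqrt_p_dn(p), p ** (2.0 / 3.0) * d_n(p))
```

For the dimensions used in Tables 1 and 2 ($p \le 250$), $\sqrt{p}\,d_n$ is still of order 0.7, which is one possible reason for the moderate detection frequencies at small $p$.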
Next, suppose $k = 3$ and $A$ is a $p \times 3$ matrix of the form $A = (\sqrt{c_1}\,v_1,\ \sqrt{c_2}\,v_2)$, where $c_1 = 10$,
$c_2 = 5$,

\[ v_1 = (1\ \ 0\ \cdots\ 0)^* \quad\text{and}\quad
v_2 = \begin{pmatrix} 0 & 1/\sqrt{2} & 1/\sqrt{2} & 0 & \cdots & 0 \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} & 0 & \cdots & 0 \end{pmatrix}^* . \]

Besides, assume $\mathrm{Cov}(s_i) = I_k$. In this setting, we have two spike eigenvalues $c_1 = 10$,
$c_2 = 5$ (before whitening) with multiplicities $n_1 = 1$, $n_2 = 2$, respectively. Finally, we choose
$\mathrm{Cov}(e_i)$ to be either diagonal or non-diagonal as below.


Table 1
Frequency of our estimator in Model 1.

p                50      100     150     200     250
n                100     200     300     400     500
T                250     500     750     1000    1250
$\hat k = 1$     0.038   0.003   0       0       0.001
$\hat k = 2$     0.578   0.317   0.166   0.103   0.047
$\hat k = 3$     0.381   0.675   0.818   0.883   0.937
$\hat k = 4$     0.003   0.005   0.016   0.014   0.015

• For Model 1: set $\mathrm{Cov}(e_i) = \mathrm{diag}(1,\dots,1,2,\dots,2)$, with $p/2$ entries equal to 1 followed by
$p/2$ entries equal to 2. In this case, the three
non-zero eigenvalues of $(c_1 v_1 v_1^* + c_2 v_2 v_2^*)\cdot[\mathrm{Cov}(e_i)]^{-1}$ equal 10, 5, 5, respectively, which
are all larger than the critical value $3.55 - 1$; therefore, the number of detectable signals
is three;

• For Model 2: set $\mathrm{Cov}(e_i)$ to be compound symmetric, with all the diagonal elements
equal to 1 and all the off-diagonal elements equal to 0.1. In this case, for each
given $p$, the three non-zero eigenvalues of $(c_1 v_1 v_1^* + c_2 v_2 v_2^*)\cdot[\mathrm{Cov}(e_i)]^{-1}$ are all larger
than 5.36 ($> 3.55 - 1$). The number of detectable signals is again three.


Table 2
Frequency of our estimator in Model 2.

p                50      100     150     200     250
n                100     200     300     400     500
T                250     500     750     1000    1250
$\hat k = 1$     0.016   0       0       0       0
$\hat k = 2$     0.475   0.186   0.053   0.028   0.008
$\hat k = 3$     0.505   0.806   0.926   0.950   0.971
$\hat k = 4$     0.004   0.008   0.021   0.022   0.021

Tables 1 and 2 report the empirical frequencies of our estimator $\hat k = 1, 2, 3, 4$ in Model 1
and Model 2, respectively, where the true number of signals is $k = 3$. Figure 5 shows
more clearly the trend of the frequency of correct estimation in both cases. The
frequency of correct estimation increases in both models as $p$ gets larger, which is in line with the consistency of our estimator.

Figure 5: Frequency of correct estimation $\hat k = k = 3$ (panels: Model 1 and Model 2).
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Appendix A: Some lemmas 


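Lemma A.1. Let $R$ be an $M \times M$ real-valued matrix, and let $U = (U_1 \cdots U_k)$ and $V =
(V_1 \cdots V_k)$ be two orthogonal bases of some subspace $E \subseteq \mathbb{R}^M$ of dimension $M$, where
both $U_i$ and $V_i$ are of size $M \times n_i$, satisfying $n_1 + \cdots + n_k = M$. Then the two $n_i \times n_i$ matrices
$U_i^*RU_i$ and $V_i^*RV_i$ have the same eigenvalues.

Proof (of Lemma A.1). It is sufficient to prove that there exists an $n_i \times n_i$ orthogonal matrix
$A$ such that $V_i = U_i \cdot A$. If this is true, then $V_i^*RV_i = A^*(U_i^*RU_i)A$, and since $A$ is orthogonal,
the eigenvalues of $V_i^*RV_i$ and $U_i^*RU_i$ are the same. Now let $U_i = (u_1 \cdots u_{n_i})$
and $V_i = (v_1 \cdots v_{n_i})$. Define $A = (a_{ls})_{1 \le l,s \le n_i}$ such that

\[ \begin{cases} v_1 = a_{11}u_1 + \cdots + a_{n_i 1}u_{n_i} , \\ \quad\vdots \\ v_{n_i} = a_{1 n_i}u_1 + \cdots + a_{n_i n_i}u_{n_i} . \end{cases} \]

Put in matrix form:

\[ (v_1 \cdots v_{n_i}) = (u_1 \cdots u_{n_i}) \begin{pmatrix} a_{11} & \cdots & a_{1 n_i} \\ \vdots & \ddots & \vdots \\ a_{n_i 1} & \cdots & a_{n_i n_i} \end{pmatrix} , \]

i.e. $V_i = U_i \cdot A$. Since $\langle v_i, v_j \rangle = \langle a_{\cdot i}, a_{\cdot j} \rangle$ by orthogonality of $\{u_j\}$, where $a_{\cdot k} =
(a_{lk})_{1 \le l \le n_i}$, the matrix $A$ is orthogonal. □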
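The mechanism of the proof, namely that $V_i = U_i A$ with $A$ orthogonal leaves the spectrum of $U_i^*RU_i$ unchanged, can be checked numerically. The sketch below is only an illustration with arbitrary dimensions; it uses a symmetric $R$ so that the eigenvalues are real and can be compared after sorting.

```python
import numpy as np

rng = np.random.default_rng(0)
M, n_i = 5, 2

# Symmetric test matrix R (symmetry is an extra assumption made for convenience).
R = rng.standard_normal((M, M))
R = (R + R.T) / 2.0

# U_i: n_i orthonormal columns; A: an n_i x n_i orthogonal matrix; V_i = U_i A.
Q, _ = np.linalg.qr(rng.standard_normal((M, M)))
U_i = Q[:, :n_i]
A, _ = np.linalg.qr(rng.standard_normal((n_i, n_i)))
V_i = U_i @ A

eig_U = np.linalg.eigvalsh(U_i.T @ R @ U_i)
eig_V = np.linalg.eigvalsh(V_i.T @ R @ V_i)
print(eig_U, eig_V)
```

The two printed spectra agree up to floating-point error, because $V_i^*RV_i = A^*(U_i^*RU_i)A$ is an orthogonal similarity transform of $U_i^*RU_i$.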

Lemma A.2. Suppose $X = (x_1,\dots,x_n)$ is a $p \times n$ matrix whose columns $\{x_i\}$ are independent
random vectors, and $Y = (y_1,\dots,y_n)$ is defined similarly. Let $\Sigma_p$ be the covariance
matrix of $x_i$ and $y_i$, and let $A$ be a deterministic matrix; then we have

\[ XAY^* \to \mathrm{tr}A \cdot \Sigma_p . \]

Moreover, if $A$ is random but independent of $X$ and $Y$, then we have

\[ XAY^* \to E\,\mathrm{tr}A \cdot \Sigma_p . \tag{A.1} \]

Proof. We consider the $(i,j)$-th entry of $XAY^*$:

\[ XAY^*(i,j) = \sum_{k,l=1}^{n} X(i,k)A(k,l)Y^*(l,j) = \sum_{k,l=1}^{n} X_{ik}Y_{jl}A_{kl} . \tag{A.2} \]

Since $X_{ik}Y_{jl} \to \Sigma_p(i,j)$ when $k = l$, the right-hand side of (A.2) tends to
$\Sigma_p(i,j)\cdot\sum_{k=1}^{n} A_{kk}$, which is equivalent to saying that

\[ XAY^* \to \mathrm{tr}A \cdot \Sigma_p . \]

Then (A.1) follows by taking the conditional expectation. The proof of Lemma A.2 is
complete. □


In all the following, $\lambda$ refers to the outlier limit

\[ \lambda = \frac{a(a-1+c)}{a-1-ay} . \]

Lemma A.3. We have

\[ \begin{aligned}
s(\lambda) &= \frac{a(y-1)+1}{(a-1)(a+c-1)} , \\
m_1(\lambda) &= \frac{(a(y-1)+1)^2\,(-1+2a+a^2(y-1)+y(c-1))}{(a-1)^2(a+c-1)^2\,(-1+2a+c+a^2(y-1))} , \\
m_2(\lambda) &= \frac{1}{a-1} , \\
m_3(\lambda) &= \frac{-(a(y-1)+1)^2}{(a-1)^2\,(-1+2a+c+a^2(y-1))} , \\
m_4(\lambda) &= \frac{-1+2a+c+a^2(-1+c(y-1))}{(a-1)^2\,(-1+2a+c+a^2(y-1))} .
\end{aligned} \]

Proof (sketch of the proof of Lemma A.3). In this short proof, we skip all the detailed
calculations. Recall the definition of $s(z)$ in (3.13); its value at $\lambda$ is

\[ s(\lambda) = \frac{a(y-1)+1}{(a-1)(a+c-1)} . \tag{A.3} \]

Also, (3.13) says that $\underline{s}(z)$ is the solution of the following equation:

\[ z(c+zy)\,\underline{s}^2(z) + \big(c(z(1-y)+1-c) + 2zy\big)\,\underline{s}(z) + c + y - cy = 0 . \tag{A.4} \]

Taking derivatives on both sides of (A.4) and combining with (A.3) gives the value of
$\underline{s}'(\lambda)$. On the other hand, since it holds that

\[ \underline{s}(z) + \frac{1}{z}(1-c) = c\,s(z) , \tag{A.5} \]

see (3.12), taking derivatives on both sides again gives the value of $s'(\lambda)$. Finally, the
above five values are just combinations of $s(\lambda)$ and $s'(\lambda)$.

The proof of Lemma A.3 is complete. □
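The closed forms of Lemma A.3 can be sanity-checked numerically. The Python sketch below is only an illustration under stated assumptions: it takes the companion transform as $\underline{s}(z) = c\,s(z) - (1-c)/z$, read from (A.5), and checks that this value solves the quadratic equation (A.4) at $z = \lambda$; it also checks that $1 + 2y\lambda s(\lambda) + \lambda^2 y\,m_1(\lambda) + a c\,m_3(\lambda)$ agrees with the closed form of $\Delta(\lambda)$ used in Section 4. All function names are hypothetical.

```python
def outlier_limit(a, c, y):
    return a * (a - 1 + c) / (a - 1 - a * y)

def s_val(a, c, y):
    return (a * (y - 1) + 1) / ((a - 1) * (a + c - 1))

def m1_val(a, c, y):
    d = -1 + 2 * a + c + a**2 * (y - 1)
    num = (a * (y - 1) + 1) ** 2 * (-1 + 2 * a + a**2 * (y - 1) + y * (c - 1))
    return num / ((a - 1) ** 2 * (a + c - 1) ** 2 * d)

def m3_val(a, c, y):
    d = -1 + 2 * a + c + a**2 * (y - 1)
    return -((a * (y - 1) + 1) ** 2) / ((a - 1) ** 2 * d)

def residual_A4(a, c, y):
    """Left-hand side of (A.4) at z = lambda, with s_underline taken from (A.5)."""
    z = outlier_limit(a, c, y)
    s_under = c * s_val(a, c, y) - (1 - c) / z
    return z * (c + z * y) * s_under**2 + (c * (z * (1 - y) + 1 - c) + 2 * z * y) * s_under + c + y - c * y

def delta_from_terms(a, c, y):
    z = outlier_limit(a, c, y)
    return 1 + 2 * y * z * s_val(a, c, y) + z**2 * y * m1_val(a, c, y) + a * c * m3_val(a, c, y)

def delta_closed(a, c, y):
    d = -1 + 2 * a + c + a**2 * (y - 1)
    return (1 - a - c) * (1 + a * (y - 1)) ** 2 / ((a - 1) * d)

c, y = 0.2, 0.5
for a in (20.0, 0.2, 0.1):
    print(a, residual_A4(a, c, y), delta_from_terms(a, c, y), delta_closed(a, c, y))
```

For the three spikes of Section 5 the residual of (A.4) vanishes up to rounding error and the two expressions of $\Delta(\lambda)$ coincide.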
Lemma A.4. Under assumptions (a)-(d),

\[ \frac{1}{p}\,\mathrm{tr}\Big( \lambda\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^* \Big)^{-1} \to \frac{1}{a+c-1} \quad \text{almost surely.} \]


Proof, (of Lemma A. 4) We first fix XZ 2 Z 2 ., then we can use the result in Zheng et al. 
(2013) (Lemma 4.3), which says that 

1/11 1 

-tr i - ■-Z 2 Z 2 -—X 2 X 2 ] -P-rh{z), a.s. 

p \z n T J 


where m{z) is the unique solution to the equation 

_ . . f 1 


m[z] = 


1 - ‘‘Fyin 


(A, 6 ) 


2 : l—cm{z) 


satisfying 

0 ^( 2 ;) • Q'(m(2;)) > 0 , 

here, Fy{x) is the LSD of -Z 2 Z 2 (deterministic), which is the standard M-P law with 
parameter y. Besides, if we denote its Stieltjes transform as 5 ( 2 ;) := J ^^dFy{x), then 
(A. 6 ) could be written as 


miz) = 


X — 


l—cm(z) 


dFy{x) = z-s 

' 1 — cm[zj 


(A.7) 


Since we know that the Stieltjes transform of the LSD of a standard sample covariance 
matrix satishes: 


s(2;) = 


1 


1-y- yzs{z) - z ’ 


(A.S) 


then we bring (A.7) into (A.S) leads to 

rh{z) 


1 — y — y ■ T- 

o o —, 


771(2) 


l—cm{z) z l—crh{z) 


whose nonnegative solution is unique, which is 

—l+y + z — zc+ {1 — y — z + zcY + Az{yc — y — c) 


miz = 


2(yc-y- c) 


(A.9) 


Therefore, we have for hxed -Z 2 Z 2 , 


-tr fx ■-Z 2 Z 2 -—X 2 X 2 ) ( \ I I 1 

p \ n T / \X J a + c—1 
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almost surely. Finally, due to the fact that for each oj, the ESD of ^^ 2^3 (a;) will tend to the 
same limit (standard M-P distribution), which is independent of the choice of u. Therefore, 
we have for all necessarily deterministic but independent of ^^ 2 X 2 ), 


- tr ( A ■ —Z 2 Z 2 — 
p \ n 


2^2 — 7fX2X^. 


-1 


-)■ 


a + c — 1 


almost surely. 

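The proof of Lemma A.4 is complete. □

Lemma A.5. Let $A(\lambda)$, $B(\lambda)$, $C(\lambda)$ and $D(\lambda)$ be defined as in (4.5); then

\[ (l-\lambda)\cdot\frac{1}{n}Z_1A(l)Z_1^* \to (l-\lambda)\cdot(1 + y\lambda s(\lambda))\cdot I_M , \tag{A.10} \]

\[ \frac{\lambda}{n}Z_1[A(l)-A(\lambda)]Z_1^* \to (l-\lambda)\cdot(\lambda y s(\lambda) + \lambda^2 y\, m_1(\lambda))\cdot I_M , \tag{A.11} \]

\[ \frac{1}{T}X_1(B(l)-B(\lambda))X_1^* \to -(l-\lambda)\cdot c\,m_3(\lambda)\cdot U \begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_k \end{pmatrix} U^* , \tag{A.12} \]

\[ \frac{\lambda}{n}Z_1(C(l)-C(\lambda))X_1^* + \frac{l-\lambda}{n}Z_1C(\lambda)X_1^* \to (l-\lambda)\cdot 0_{M\times M} , \tag{A.13} \]

\[ \frac{\lambda}{T}X_1(D(l)-D(\lambda))Z_1^* + \frac{l-\lambda}{T}X_1D(\lambda)Z_1^* \to (l-\lambda)\cdot 0_{M\times M} . \tag{A.14} \]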

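Proof (of Lemma A.5).

Proof of (A.10): Since $Z_1$ is independent of $A$ and $\mathrm{Cov}(Z_1) = I_M$, we combine this fact
with Lemma A.2:

\[ (l-\lambda)\cdot\frac{1}{n}Z_1A(l)Z_1^* \to (l-\lambda)\cdot\frac{1}{n}E\,\mathrm{tr}A(l)\cdot I_M . \tag{A.15} \]

Considering the expression of $A(l)$, we have

\[ \begin{aligned}
\frac{1}{n}E\,\mathrm{tr}A(\lambda) &= \frac{1}{n}E\,\mathrm{tr}\Big\{ I_n - \lambda Z_2^*(\lambda I_p - S)^{-1}\Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{n}Z_2 \Big\} \\
&= 1 - \frac{\lambda}{n}E\,\mathrm{tr}(\lambda I_p - S)^{-1} \\
&= 1 - y\lambda\int\frac{1}{\lambda-x}\,dF_{c,y}(x) + o(1) \\
&= 1 + y\lambda s(\lambda) + o(1) .
\end{aligned} \]

Therefore, combining with (A.15), we have

\[ (l-\lambda)\cdot\frac{1}{n}Z_1A(l)Z_1^* \to (l-\lambda)(1 + y\lambda s(\lambda))\cdot I_M . \]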
n 
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Proof of (A.11): Bringing the expression of A{1) into consideration, we first have 
A{1) - A{X) 


= z;{\ip - s)-^(-Z2Z;] ^-Z 2 - z;{iip - s)-^(-Z2Z;] ^ ^ 
\n / n \n / 


n 


-z;(xg-sYQz.zr'^-‘ 


n 


= (/ - A) ■ 


n 

-z;{\ip - s)-^(-Z2z;y'-Z2 + z;{xip - s)-\iip - s)-y-Z2z;) 

\n / n \n / 


-1 / 


Z2 + z;[{xip- s)-^ - {lip - s)-^] (-Z2Z;) -Z2 

Z2z * y - Z2 


n 


Then using Lemma A.2 for the same reason, we have 


-Zr[A{l) - A(A)]Z* ^ - {Etr {A{1) - A(A))} • Im , 
n n 


and 


-Etr {A{l)-A{X)) = (/-A) 
n 


--¥.ii\z;{xip-s)-y-Z2Z*] '-Z; 

n I \n / n 

+ -Etr I z;{xip - S)-\lIp -S)-y-Z2Z;y^-2 
n \ \n / n 


= {l-X)- 
= (/-A)- 


- -Etr(A/„ - S)-^ + -EtrfA/p - S)-"^ + o(l) 
n n 


y 


-dFc.,y{x) + Xy 


X — X 

= {l-X)- ys{X) + Xymi{X) + o{l) 


{X-x) 


-dFc,y{x) + o(l) 


Therefore, we have 
A 


n 


Zi[A(0 - A{X)]Zl ^ (/ - A) • (2/As(A) + yym^{X)) ■ Im ■ 


Proof of (A.12): 

First recall the fact that Cov(Xi) = U 
Lemma A.2,we have 


( 


ai 




U* and Xx is independent of B. Using 


\ flfc / 


1a-i(B(;) - B(X))Xl ^ lEtr(B(;) - B(A)) ■ 1] 




ax 


\ 


U* . 


\ dk j 
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The part 


Etr(5(/) -5(A)) 


T 

= -Etr <! X 


T 

= {l-X)- 
= (/ - A) ■ 


{lip - S)-^ - {XIp - S) 


-1 


-z^z;] '4x. 


n 


--Etr{(A/p-S)-''S}+o(l) 


—c 


X 


{X-xy 

= (/ - A) • {-cms{X) + o(l)) 


dFc^y{x) + o(l) 


Therefore, we have 


-Xi(5(/) - 5(A))X* ^ -c(/ - A)m 3 (A) • U 


( 0,1 

\ 


T 


\ 


Ok J 


U* . 


Proof of (A.13) and (A.14): (A.13) and (A.14) follow simply from the fact that
$\mathrm{Cov}(X_1, Z_1) = 0_{M\times M}$.

The proof of Lemma A.5 is complete. □

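Lemma A.6. Define

\[ \tilde R_n(\lambda_i) := (Z_1\ \ S_1)
\begin{pmatrix}
\dfrac{\lambda_i\sqrt{p}}{n}\,A(\lambda_i) & \dfrac{\lambda_i\sqrt{a_i p}}{n}\,C(\lambda_i) \\[2mm]
\dfrac{\lambda_i\sqrt{a_i p}}{T}\,D(\lambda_i) & -\dfrac{a_i\sqrt{p}}{T}\,B(\lambda_i)
\end{pmatrix}
\begin{pmatrix} Z_1^* \\ S_1^* \end{pmatrix} - E[\cdot] ; \]

then $\tilde R_n(\lambda_i)$ weakly converges to an $M \times M$ symmetric random matrix $R(\lambda_i) = (R_{mn})$, which
is made with independent Gaussian entries of mean zero and variance

\[ \mathrm{Var}(R_{mn}) = \begin{cases} 2\theta_i + (v_4-3)\,\omega_i , & m = n , \\ \theta_i , & m \ne n , \end{cases} \]

where

\[ \omega_i = \frac{a_i^2(a_i+c-1)^2(c+y)}{(a_i-1)^2} , \qquad
\theta_i = \frac{a_i^2(a_i+c-1)^2(cy-c-y)}{-1+2a_i+c+a_i^2(y-1)} . \]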

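Proof. Since $Z_1$ and $S_1$ are independent, have the same first four moments and are both
made with i.i.d. components, we can view $(Z_1\ \ S_1)$ as an $M \times (n+T)$ table made
with i.i.d. elements of mean 0 and variance 1. Besides, we can rewrite the expressions of
$A(\lambda)$, $B(\lambda)$, $C(\lambda)$ and $D(\lambda)$ as follows:

\[ \begin{aligned}
A(\lambda) &= I_n - \lambda Z_2^*\Big(\lambda\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\frac{1}{n}Z_2 , \\
B(\lambda) &= I_T + X_2^*\Big(\lambda\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\frac{1}{T}X_2 , \\
C(\lambda) &= Z_2^*\Big(\lambda\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\frac{1}{T}X_2 , \\
D(\lambda) &= X_2^*\Big(\lambda\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\frac{1}{n}Z_2 .
\end{aligned} \]

It holds that

\[ A(\lambda)^* = A(\lambda) , \qquad B(\lambda)^* = B(\lambda) , \qquad T\cdot C(\lambda)^* = n\cdot D(\lambda) ; \]

therefore, the matrix

\[ \begin{pmatrix}
\dfrac{\lambda_i\sqrt{p}}{n}\,A(\lambda_i) & \dfrac{\lambda_i\sqrt{a_i p}}{n}\,C(\lambda_i) \\[2mm]
\dfrac{\lambda_i\sqrt{a_i p}}{T}\,D(\lambda_i) & -\dfrac{a_i\sqrt{p}}{T}\,B(\lambda_i)
\end{pmatrix} \]

is symmetric. Define

\[ A_n(\lambda_i) = \sqrt{n+T}\cdot
\begin{pmatrix}
\dfrac{\lambda_i\sqrt{p}}{n}\,A(\lambda_i) & \dfrac{\lambda_i\sqrt{a_i p}}{n}\,C(\lambda_i) \\[2mm]
\dfrac{\lambda_i\sqrt{a_i p}}{T}\,D(\lambda_i) & -\dfrac{a_i\sqrt{p}}{T}\,B(\lambda_i)
\end{pmatrix} . \tag{A.16} \]

Now we can apply the results in Bai and Yao (2008) (Proposition 3.1 and Remark 1), which
say that $\tilde R_n(\lambda_i)$ weakly converges to an $M \times M$ symmetric random matrix $R(\lambda_i) = (R_{mn})$,
which is made with independent Gaussian entries of mean zero and variance

\[ \mathrm{Var}(R_{mn}) = \begin{cases} 2\theta_i + (v_4-3)\,\omega_i , & m = n , \\ \theta_i , & m \ne n . \end{cases} \]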

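The following is devoted to the calculation of the values of $\theta_i$ and $\omega_i$.

Calculation of $\theta_i$. From the definition of $\theta$ (see Bai and Yao (2008) for details), we have

\[ \begin{aligned}
\theta_i &= \lim\frac{1}{n+T}\,\mathrm{tr}A_n^2(\lambda_i)
 = \lim\,\mathrm{tr}\begin{pmatrix}
\dfrac{\lambda_i\sqrt{p}}{n}\,A(\lambda_i) & \dfrac{\lambda_i\sqrt{a_i p}}{n}\,C(\lambda_i) \\[2mm]
\dfrac{\lambda_i\sqrt{a_i p}}{T}\,D(\lambda_i) & -\dfrac{a_i\sqrt{p}}{T}\,B(\lambda_i)
\end{pmatrix}^2 \\
&= \lim\Big( \frac{\lambda_i^2 p}{n^2}\,\mathrm{tr}A^2(\lambda_i) + \frac{2\lambda_i^2 a_i p}{nT}\,\mathrm{tr}\,C(\lambda_i)D(\lambda_i) + \frac{a_i^2 p}{T^2}\,\mathrm{tr}B^2(\lambda_i) \Big) .
\end{aligned} \tag{A.17} \]

With $S = \big(\frac{1}{n}Z_2Z_2^*\big)^{-1}\frac{1}{T}X_2X_2^*$, the three traces are

\[ \begin{aligned}
\mathrm{tr}A^2(\lambda_i) &= \mathrm{tr}\Big\{ I_n + \lambda_i^2 Z_2^*(\lambda_i I_p - S)^{-1}\Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{n}Z_2Z_2^*(\lambda_i I_p - S)^{-1}\Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{n}Z_2 \\
&\qquad\quad - 2\lambda_i Z_2^*(\lambda_i I_p - S)^{-1}\Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{n}Z_2 \Big\} \\
&= n + \lambda_i^2\,\mathrm{tr}(\lambda_i I_p - S)^{-2} - 2\lambda_i\,\mathrm{tr}(\lambda_i I_p - S)^{-1} \\
&= n + p\lambda_i^2 m_1(\lambda_i) + 2p\lambda_i s(\lambda_i) ,
\end{aligned} \tag{A.18} \]

\[ \begin{aligned}
\mathrm{tr}\,C(\lambda_i)D(\lambda_i) &= \mathrm{tr}\Big\{ Z_2^*(\lambda_i I_p - S)^{-1}\Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{T}X_2X_2^*(\lambda_i I_p - S)^{-1}\Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{n}Z_2 \Big\} \\
&= \mathrm{tr}(\lambda_i I_p - S)^{-1}S(\lambda_i I_p - S)^{-1} = p\,m_3(\lambda_i) ,
\end{aligned} \tag{A.19} \]

\[ \begin{aligned}
\mathrm{tr}B^2(\lambda_i) &= \mathrm{tr}\Big\{ I_T + X_2^*(\lambda_i I_p - S)^{-1}\Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{T}X_2X_2^*(\lambda_i I_p - S)^{-1}\Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{T}X_2 \\
&\qquad\quad + 2X_2^*(\lambda_i I_p - S)^{-1}\Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{T}X_2 \Big\} \\
&= T + \mathrm{tr}(\lambda_i I_p - S)^{-1}S(\lambda_i I_p - S)^{-1}S + 2\,\mathrm{tr}(\lambda_i I_p - S)^{-1}S \\
&= T + p\,m_4(\lambda_i) + 2p\,m_2(\lambda_i) .
\end{aligned} \tag{A.20} \]

Combining (A.17), (A.18), (A.19) and (A.20), we have

\[ \begin{aligned}
\theta_i &= \lambda_i^2 y\big(1 + y\lambda_i^2 m_1(\lambda_i) + 2y\lambda_i s(\lambda_i)\big) + 2\lambda_i^2 a_i c\,y\,m_3(\lambda_i) + a_i^2 c\big(1 + c\,m_4(\lambda_i) + 2c\,m_2(\lambda_i)\big) \\
&= \frac{a_i^2(a_i+c-1)^2(cy-c-y)}{-1+2a_i+c+a_i^2(y-1)} .
\end{aligned} \]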

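Calculation of $\omega_i$.

\[ \omega_i = \lim\frac{1}{n+T}\sum_{i=1}^{n+T}\big(A_n(\lambda_i)(i,i)\big)^2
 = \lim\Big[ \sum_{i=1}^{n}\frac{\lambda_i^2 p}{n^2}\,A(i,i)^2 + \sum_{i=1}^{T}\frac{a_i^2 p}{T^2}\,B(i,i)^2 \Big] . \tag{A.21} \]

In the following, we will show that $A(i,i)$ and $B(i,i)$ both tend to limits that are
independent of $i$. First,

\[ \begin{aligned}
A(i,i) &= \Big[ I_n - \lambda_i Z_2^*\Big(\lambda_i I_p - \Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{T}X_2X_2^*\Big)^{-1}\Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{n}Z_2 \Big](i,i) \\
&= 1 - \frac{\lambda_i}{n}\Big[ Z_2^*\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}Z_2 \Big](i,i) .
\end{aligned} \tag{A.22} \]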




















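If we denote $\eta_i$ as the $i$-th column of $Z_2$, we have

\[ \frac{1}{n}Z_2Z_2^* = \frac{1}{n}(\eta_1 \cdots \eta_n)\begin{pmatrix} \eta_1^* \\ \vdots \\ \eta_n^* \end{pmatrix}
 = \frac{1}{n}\eta_i\eta_i^* + \frac{1}{n}Z_{2i}Z_{2i}^* , \]

where $Z_{2i}$ is independent of $\eta_i$. Since

\[ \begin{aligned}
&\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1} - \Big(\lambda_i\cdot\frac{1}{n}Z_{2i}Z_{2i}^* - \frac{1}{T}X_2X_2^*\Big)^{-1} \\
&\qquad = -\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\frac{\lambda_i}{n}\eta_i\eta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_{2i}Z_{2i}^* - \frac{1}{T}X_2X_2^*\Big)^{-1} ,
\end{aligned} \]

we have

\[ \Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\eta_i
 = \frac{\Big(\lambda_i\cdot\frac{1}{n}Z_{2i}Z_{2i}^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\eta_i}{1 + \frac{\lambda_i}{n}\eta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_{2i}Z_{2i}^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\eta_i} . \tag{A.23} \]

Bringing (A.23) into (A.22),

\[ \begin{aligned}
A(i,i) &= 1 - \frac{\lambda_i}{n}\eta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\eta_i \\
&= 1 - \frac{\frac{\lambda_i}{n}\eta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_{2i}Z_{2i}^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\eta_i}{1 + \frac{\lambda_i}{n}\eta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_{2i}Z_{2i}^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\eta_i} \\
&= \frac{1}{1 + \frac{\lambda_i}{n}\eta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_{2i}Z_{2i}^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\eta_i} ,
\end{aligned} \tag{A.24} \]

whose denominator is asymptotically

\[ 1 + \frac{\lambda_i}{n}\,\mathrm{tr}\Big(\lambda_i\cdot\frac{1}{n}Z_{2i}Z_{2i}^* - \frac{1}{T}X_2X_2^*\Big)^{-1} . \]

Since $\eta_i$ is independent of $\big(\lambda_i\cdot\frac{1}{n}Z_{2i}Z_{2i}^* - \frac{1}{T}X_2X_2^*\big)^{-1}$, the denominator of (A.24) converges to the value
$1 + \lambda_i y\cdot\frac{1}{a_i+c-1}$ according to Lemma A.4. Therefore, we have

\[ A(i,i) \to \frac{1}{1 + \lambda_i y\cdot\frac{1}{a_i+c-1}} , \tag{A.25} \]

which is independent of the choice of $i$.
For the same reason, we have

\[ \begin{aligned}
B(i,i) &= 1 + \Big[ X_2^*\Big(\lambda_i I_p - \Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{T}X_2X_2^*\Big)^{-1}\Big(\frac{1}{n}Z_2Z_2^*\Big)^{-1}\frac{1}{T}X_2 \Big](i,i) \\
&= 1 + \frac{1}{T}\Big[ X_2^*\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}X_2 \Big](i,i) .
\end{aligned} \tag{A.26} \]

If we denote $\delta_i$ as the $i$-th column of $X_2$, then we have

\[ \frac{1}{T}X_2X_2^* = \frac{1}{T}(\delta_1 \cdots \delta_T)\begin{pmatrix} \delta_1^* \\ \vdots \\ \delta_T^* \end{pmatrix}
 = \frac{1}{T}\delta_i\delta_i^* + \frac{1}{T}X_{2i}X_{2i}^* , \]

and

\[ \begin{aligned}
&\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1} - \Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_{2i}X_{2i}^*\Big)^{-1} \\
&\qquad = \Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\frac{1}{T}\delta_i\delta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_{2i}X_{2i}^*\Big)^{-1} .
\end{aligned} \]

So we have

\[ \Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\delta_i
 = \frac{\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_{2i}X_{2i}^*\Big)^{-1}\delta_i}{1 - \frac{1}{T}\delta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_{2i}X_{2i}^*\Big)^{-1}\delta_i} . \tag{A.27} \]

Combining (A.26) and (A.27), we have

\[ \begin{aligned}
B(i,i) &= 1 + \frac{1}{T}\delta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_2X_2^*\Big)^{-1}\delta_i \\
&= 1 + \frac{\frac{1}{T}\delta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_{2i}X_{2i}^*\Big)^{-1}\delta_i}{1 - \frac{1}{T}\delta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_{2i}X_{2i}^*\Big)^{-1}\delta_i} \\
&= \frac{1}{1 - \frac{1}{T}\delta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_{2i}X_{2i}^*\Big)^{-1}\delta_i} .
\end{aligned} \tag{A.28} \]

Using the independence between $\delta_i$ and $\big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_{2i}X_{2i}^*\big)^{-1}$ and Lemma A.4 again,
we have

\[ \frac{1}{T}\delta_i^*\Big(\lambda_i\cdot\frac{1}{n}Z_2Z_2^* - \frac{1}{T}X_{2i}X_{2i}^*\Big)^{-1}\delta_i \to c\cdot\frac{1}{a_i+c-1} . \]

Therefore, we have

\[ B(i,i) \to \frac{1}{1 - c\cdot\frac{1}{a_i+c-1}} , \]

which is also independent of the choice of $i$.

Finally, taking the definition of $\omega_i$ in (A.21) into consideration, we have

\[ \omega_i = \frac{\lambda_i^2 y}{\Big(1 + y\lambda_i\cdot\frac{1}{a_i+c-1}\Big)^2} + \frac{a_i^2 c}{\Big(1 - c\cdot\frac{1}{a_i+c-1}\Big)^2}
 = \frac{a_i^2(a_i+c-1)^2(c+y)}{(a_i-1)^2} . \tag{A.29} \]

The proof of Lemma A.6 is complete. □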
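The two closed forms obtained in this proof can be checked numerically against the intermediate expressions. The Python sketch below is an illustration only: it plugs the values of $s$, $m_1,\dots,m_4$ from Lemma A.3 into the combination of (A.17)-(A.20) and into the limits of $A(i,i)$ and $B(i,i)$, and compares the results with the final formulas for $\theta_i$ and $\omega_i$. The function names are hypothetical, and $m_2(\lambda) = 1/(a-1)$ is the reading of Lemma A.3 assumed here.

```python
def lemma_a3_values(a, c, y):
    """lambda, s, m1, m2, m3, m4 as given in Lemma A.3 (m2 = 1/(a-1) is an assumed reading)."""
    d = -1 + 2 * a + c + a**2 * (y - 1)
    q = a * (y - 1) + 1
    lam = a * (a - 1 + c) / (a - 1 - a * y)
    s = q / ((a - 1) * (a + c - 1))
    m1 = q**2 * (-1 + 2 * a + a**2 * (y - 1) + y * (c - 1)) / ((a - 1) ** 2 * (a + c - 1) ** 2 * d)
    m2 = 1.0 / (a - 1)
    m3 = -(q**2) / ((a - 1) ** 2 * d)
    m4 = (-1 + 2 * a + c + a**2 * (-1 + c * (y - 1))) / ((a - 1) ** 2 * d)
    return lam, s, m1, m2, m3, m4

def theta_from_traces(a, c, y):
    """Combination of the three trace limits (A.18)-(A.20) as in (A.17)."""
    lam, s, m1, m2, m3, m4 = lemma_a3_values(a, c, y)
    return (lam**2 * y * (1 + y * lam**2 * m1 + 2 * y * lam * s)
            + 2 * lam**2 * a * c * y * m3
            + a**2 * c * (1 + c * m4 + 2 * c * m2))

def theta_closed(a, c, y):
    return a**2 * (a + c - 1) ** 2 * (c * y - c - y) / (-1 + 2 * a + c + a**2 * (y - 1))

def omega_from_limits(a, c, y):
    """Uses the limits of A(i,i) and B(i,i) derived in the proof."""
    lam = lemma_a3_values(a, c, y)[0]
    a_lim = 1.0 / (1 + lam * y / (a + c - 1))
    b_lim = 1.0 / (1 - c / (a + c - 1))
    return lam**2 * y * a_lim**2 + a**2 * c * b_lim**2

def omega_closed(a, c, y):
    return a**2 * (a + c - 1) ** 2 * (c + y) / (a - 1) ** 2

c, y = 0.2, 0.5
for a in (20.0, 0.2):
    print(a, theta_from_traces(a, c, y), theta_closed(a, c, y),
          omega_from_limits(a, c, y), omega_closed(a, c, y))
```

For $a = 0.2$ this reproduces the values $\theta_2 \approx 0.02$ and $\omega_2 \approx 0.016$ quoted in Section 5.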

















